2. hypotheses and comparisons Flashcards
(25 cards)
why do we need comparisons and hypotheses in political research?
In political science—especially when we study international relations (IR)—we’re often asking “why” questions:
Why do some countries go to war while others don’t?
Why do trade agreements succeed in some regions but fail in others?
To answer these questions scientifically, we need to build and test theories.
def and utility (in pol sc) of a theory
A theory is a structured way to explain how and why things happen.
In political science, theories help us:
Identify important factors (e.g. power, interests, institutions)
Understand causal relationships (e.g. Does more democracy reduce the likelihood of war?)
Propose explanations that we can then test with real-world data
➡️ So, theories give us hypotheses (specific, testable claims) that we can compare using evidence.
def comparisons
how we spot patterns (e.g. comparing democratic vs authoritarian regimes).
def hypotheses and how to make it useful
specific prediction derived from a theory (e.g. If a state is democratic, it is less likely to go to war with another democracy)
->To make hypotheses useful, we must:
Define our concepts clearly (what do we mean by “democracy”?)
Measure them reliably (how do we measure “likelihood of war”?)
what is science according to Popper
A true scientist is not someone who has all the answers, but someone who keeps questioning, testing, and trying to disprove ideas.
-> we can never truly say a theory is proven or verified
- we can only say it has not yet been falsified
🧪 Falsifiability:
A scientific statement must be falsifiable. That means:
It must be possible to imagine evidence that would prove it wrong.
Examples:
✔️ “Countries with more trade are less likely to go to war” – testable ✅
❌ “Peace is caused by good vibes” – not testable ❌
science according to William Roberts Clark, Matt Golder, Sona Nadenichek Golder
Scientific statements must be testable in a way that could falsify them (show them to be wrong)
- if not falsified -> provisionally accepted as our best attempt at the truth
=> vs Popper: we buy into theories/knowledge because we have tried to test them and to show that they could be wrong
explain “all models are wrong, but some are useful” from George Box (+ his def of a model)
no model can capture every detail of reality—especially in the complex world of international politics. But that’s okay!
A model is:
- simplified, abstract representation of some larger and more complicated subject
- used to convey the essential features of a theory about political behaviour
def causal theory
Essentially a coherent story that explains how and why something happens
A theory helps us understand what has happened in the past and predict what will happen in the future
We use variables to express these ideas:
Type of Variable | Role in the Theory | Example
Independent Variable (IV) | The cause | Level of democracy in a country
Dependent Variable (DV) | The effect | Likelihood of war with neighbors
🧩 A good causal theory tells us why a change in the IV causes a change in the DV.
explain: The type of comparison depends on the measurement level of the IV and DV
IV Type | DV Type | Method
Categorical | Categorical | Cross-tabulation (% in each category)
Categorical | Interval (e.g., mean score) | Controlled mean comparison
Interval | Interval | Correlation or regression
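The table above can be read as a simple lookup rule. A minimal Python sketch, assuming that "categorical" covers nominal and ordinal and that ratio-level variables are handled like interval ones (the function name and level labels are illustrative, not from the course):

```python
def comparison_method(iv_level: str, dv_level: str) -> str:
    """Suggest a comparison method from the measurement levels of IV and DV."""
    categorical = {"nominal", "ordinal"}
    numeric = {"interval", "ratio"}  # assumption: ratio is treated like interval

    if iv_level in categorical and dv_level in categorical:
        return "cross-tabulation"
    if iv_level in categorical and dv_level in numeric:
        return "mean comparison"
    if iv_level in numeric and dv_level in numeric:
        return "correlation or regression"
    return "not covered by the table"


print(comparison_method("nominal", "ordinal"))    # both categorical
print(comparison_method("ordinal", "interval"))   # categorical IV, interval DV
print(comparison_method("interval", "interval"))  # both interval
```

Combinations the table does not list (e.g., interval IV with categorical DV) fall through to the last return on purpose.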
What Makes a Good Explanation (or Theory)?
- it is causal: it identifies the causal mechanism (how the cause leads to the effect)
- hypotheses (how you test theories)
- making comparisons
explain "a good explanation is causal" as a component of a good explanation
A good theory doesn’t just describe what’s happening—it explains why it happens. It connects:
A dependent variable (DV) – the outcome or effect you’re trying to explain
To one or more independent variables (IVs) – the causes or influences
🔄 “X causes Y because…” – that’s the heart of causal explanation.
EX:
Question: Why do some people support increasing the Social Security budget while others don't?
Here, the dependent variable is: opinion about the Social Security budget.
->✖️ Poor Explanation (Tautology):
People support it because they think we should spend more on it.
This isn’t helpful—it’s circular and non-causal.
🤏 Slightly Better:
Democrats and Republicans have different opinions on Social Security.
Okay, this points toward a causal factor (party affiliation), but it’s too vague. How and why does partisanship shape those views?
✅ Much Better Explanation:
Party identification shapes people’s views because partisanship forms early, often through parental influence. Later, citizens look to their party’s leaders for cues. Since Democratic leaders tend to support social programs, Democrats are more likely to support Social Security spending.
This is a causal process, and it’s testable. It shows:
A clear link between IV and DV
A mechanism (how partisanship influences views)
A plausible, research-based story
explain hypotheses as a component of a good explanation
A hypothesis is a testable statement about the relationship between IV and DV.
Good hypothesis format:
In a comparison of [units of analysis], those with [a value on IV] will be more likely to have [a value on DV] than those with [a different value on IV].
🔁 For every research hypothesis, there’s a null hypothesis:
It says there is no relationship between the IV and DV.
You test whether you can reject this null hypothesis with evidence.
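One way to see what "rejecting the null" means in practice is an exact permutation test. This technique is not named in the notes and is swapped in here only as an illustration. If the IV made no difference, group labels would be interchangeable, so we count how many relabelings give a difference in means at least as large as the observed one. The data and the function name are made up.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Share of all relabelings whose |difference in means| is at least
    as large as the observed one (two-sided exact permutation test)."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_b) - mean(group_a))

    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # small tolerance guards against floating-point noise
        if abs(mean(b) - mean(a)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical treaty counts, for illustration only.
autocracies = [1, 2, 3, 4, 5]
democracies = [11, 12, 13, 14, 15]
p = permutation_p_value(autocracies, democracies)
print(p)
```

With these made-up numbers only 2 of the 252 possible relabelings are as extreme as the observed split, so the null hypothesis of no relationship would be rejected at conventional thresholds. Enumerating every split is only feasible for small samples.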
explain making comparisons as a component of a good explanation
Hypotheses suggest comparisons. They imply a research design.
Testing depends on variable types:
IV Type | DV Type | Method
Categorical | Categorical | Cross-tabulation
Categorical | Interval | Mean comparison (ex ANOVA if comparison across 3 or more categories; t-test…)
*categorical= nominal or ordinal
def categorical variable
(also called Qualitative Variables):
These variables represent categories or groups.
They can be divided into distinct categories that don’t have a meaningful order or ranking (nominal variables), or they can have a specific order (ordinal variables).
def numerical variable
(also called Quantitative Variables):
These variables represent measurable quantities and can be expressed numerically.
Interval variables
Ratio variables
Discrete variables: Countable values, typically integers. Examples: number of children, number of cars.
Continuous variables: Can take any value within a range. Examples: height, weight, temperature.
what are the cross-tabulations and 3 rules
- table that examines the relationship between two categorical variables by showing how the categories of one variable relate to the categories of another (frequency and percentage)
- If both IV and DV are categorical variables
rules:
1/ IV defines the columns, DV the rows; raw frequencies in the cells, totaled at the bottom of each column
2/ always calculate the percentages by categories of IV (column percentages)
3/ compare percentages across IV categories at a given value of DV
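The three rules can be sketched in a few lines of Python. This is a minimal illustration with made-up data (regime type as IV, treaty ratification as DV); the function name is hypothetical.

```python
from collections import Counter


def crosstab_column_percentages(cases):
    """Return {dv_value: {iv_value: percent}} using column (IV) percentages.

    cases is a list of (iv_value, dv_value) pairs, one per unit of analysis.
    """
    cell_counts = Counter(cases)                    # rule 1: raw frequencies
    column_totals = Counter(iv for iv, _ in cases)  # rule 1: column totals
    table = {}
    for (iv, dv), n in cell_counts.items():
        # rule 2: percentage within each category of the IV
        table.setdefault(dv, {})[iv] = 100 * n / column_totals[iv]
    return table


# Made-up data for illustration only.
cases = (
    [("democracy", "ratified")] * 30
    + [("democracy", "not ratified")] * 10
    + [("autocracy", "ratified")] * 12
    + [("autocracy", "not ratified")] * 18
)
table = crosstab_column_percentages(cases)

# rule 3: compare across IV categories at one value of the DV
print(table["ratified"])
```

Reading across the "ratified" row, 75% of the hypothetical democracies ratified versus 40% of the autocracies, which is the comparison rule 3 asks for.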
when do we use mean comparison table
A table that shows the mean of a dependent variable for cases that have different values on an independent variable
- used when DV is measured at the interval level and IV at the nominal or ordinal level
📐 Example Hypothesis:
“In a comparison of countries, those having higher per capita GDP will ratify more international environmental treaties than countries with lower GDP.”
📋 What It Does:
* It shows the average (mean) value of the DV for each group defined by the IV.
* This allows us to compare group means and test whether the IV is associated with differences in the DV.
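A mean comparison table can be sketched as follows. The example assumes per capita GDP has already been grouped into ordered categories (low / medium / high); the treaty counts and the function name are made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_comparison(cases):
    """Return {iv_value: (mean of DV, number of cases)} for each IV category.

    cases is a list of (iv_value, dv_value) pairs.
    """
    groups = defaultdict(list)
    for iv_value, dv_value in cases:
        groups[iv_value].append(dv_value)
    return {iv: (mean(values), len(values)) for iv, values in groups.items()}


# Hypothetical data: GDP group (IV) and number of treaties ratified (DV).
cases = [
    ("low", 2), ("low", 3), ("low", 4),
    ("medium", 5), ("medium", 6), ("medium", 7),
    ("high", 8), ("high", 9), ("high", 13),
]
table = mean_comparison(cases)
for group in ("low", "medium", "high"):
    avg, n = table[group]
    print(group, avg, n)
```

In this made-up data the group means rise from low to high GDP, the pattern the example hypothesis predicts. Whether such a difference is more than chance is a separate question.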
as a summary, describe those concepts:
1 Hypothesis
2 Null Hypothesis
3 IV (Independent Variable)
4 DV (Dependent Variable)
5 Causal Mechanism
6 Intervening Variable
7 Cross-tabulation
8 Mean comparison
1 A testable claim about the relationship between IV and DV
2 Asserts that there is no relationship between IV and DV
3 The presumed cause
4 The presumed effect
5 Explains how and why the IV affects the DV
6 A variable that lies in between the IV and DV, helping to explain the process
7 Use when both IV and DV are categorical
8 Use when DV is interval-level and IV is categorical
pos and neg relationship btwn variables
ONLY FOR ordinal, interval and ratio-level variables not nominal bcs Nominal categories don’t increase or decrease, instead, we talk about differences in proportions or probabilities
Positive (Direct): As the IV increases, the DV also increases (e.g., education level ↑ → political interest ↑).
Negative (Inverse): As the IV increases, the DV decreases (e.g., income ↑ → likelihood to vote for a particular candidate ↓).
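The direction of a relationship can be checked from the sign of a correlation coefficient. A minimal sketch using Pearson's r, which assumes interval-level variables (for ordinal data a rank-based measure would be the usual choice); the data and function names are made up.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / (spread_x * spread_y)


def direction(x, y):
    """Label the relationship by the sign of r."""
    r = pearson_r(x, y)
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


# Hypothetical values, for illustration only.
education_years = [8, 10, 12, 14, 16]
political_interest = [2, 3, 5, 6, 8]
income = [20, 40, 60, 80, 100]
candidate_support = [9, 7, 6, 4, 2]

print(direction(education_years, political_interest))
print(direction(income, candidate_support))
```

The sign only gives the direction. It says nothing about whether the relationship is linear, strong, or causal.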
linear and non-linear (or curvilinear) relationship btwn variables
*pos and neg are often linear relationships, where:
-The change in DV is consistent across values of the IV.
-Only meaningful when the IV is interval (equal unit spacing matters).
*Non-linear Relationships: Curvilinear
These don’t follow a straight line and suggest that the effect of the IV on the DV depends on the value or range of the IV.
Common shapes:
-U-shaped: Very low and very high values of the IV correspond to high DV values (e.g., political activism is high among both low- and high-income earners).
-Inverted U: Middle values of IV show higher DV than extremes.
-Example: A V- or U-shape may show that moderate income earners are less supportive of a policy compared to both low and high-income groups.
what are the different graphs representing the relationships btwn variables
1. Bar charts:
-Similar to W1 but vertical axis does not represent percentage of cases falling into each value of the independent variable
-When the independent variable is nominal
Also works for ordinal variables if you don’t want to show trends — just comparisons.
2. Line charts:
-higher data-ink ratio: the proportion of "ink" (or visual elements) that actually represents real data
-> Higher data-ink ratio = less clutter, more clarity: A good graph communicates more with less
-Often used for time series data, or when showing trends or comparisons across ordered categories SO when the independent variable (IV) is ordinal or interval (not for nominal)
-Great for showing mean comparisons across ordered groups
can the median be equal to Q1 or Q3
Yes, but it’s unusual.
It implies that at least 25% of the data fall at a single point. This often happens in heavily skewed or clustered distributions.
For example: median = Q1 means that every value between the 25th and 50th percentiles is identical, so the lower half of the data is tightly clustered while the top 50% vary more.
Should the median line in a box plot split the box into two equal parts?
Ideally, yes—if the distribution is symmetric.
But in practice, the position of the median line reflects the skewness of the data:
If the median is closer to Q1, it’s right-skewed (the right whisker (from Q3 to max) is longer than the left whisker)
If the median is closer to Q3, it’s left-skewed
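The position of the median inside the box can be checked numerically. A rough sketch using `statistics.quantiles` from the standard library; the "inclusive" method is one of several quartile conventions (an assumption here), and the data and function name are made up.

```python
from statistics import quantiles


def median_position(data):
    """Describe skew from where the median sits between Q1 and Q3."""
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    lower_half = med - q1  # width of the box below the median line
    upper_half = q3 - med  # width of the box above the median line
    if lower_half < upper_half:
        return "median closer to Q1: right-skewed"
    if lower_half > upper_half:
        return "median closer to Q3: left-skewed"
    return "median centered: roughly symmetric"


# Hypothetical samples, for illustration only.
print(median_position([1, 2, 2, 3, 3, 4, 6, 9, 15]))
print(median_position([1, 7, 10, 12, 13, 14, 14, 15, 15]))
print(median_position([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

This looks only at the box, not the whiskers, so it is a quick heuristic and not a formal skewness measure.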
def research hypotheses vs null hypotheses
Research hypothesis: a testable statement about the empirical relationship between cause (IV) and effect (DV)
Null hypothesis: asserts there is no relationship between the independent and dependent variable
- each hypothesis has a corresponding null hypothesis (either stated explicitly or implied)