Theory Cards: Econometrics, Statistics, Causal Inference Flashcards
(47 cards)
What is the purpose of A/B testing in experiments?
A/B Testing
A/B testing aims to determine if changes to a variable (e.g., a webpage design) lead to a statistically significant difference in a key metric by comparing control and treatment groups.
What is a null hypothesis in A/B testing?
A/B Testing
The null hypothesis states that there is no effect or difference between the control and treatment groups.
What is a p-value in hypothesis testing?
A/B Testing
A p-value represents the probability of observing results as extreme as those in the experiment if the null hypothesis is true. If p < 0.05, results are considered statistically significant.
What is a Type I error in A/B testing?
A/B Testing
A Type I error (false positive) occurs when the null hypothesis is incorrectly rejected, suggesting an effect exists when it doesn’t.
What is a Type II error in A/B testing?
A/B Testing
A Type II error (false negative) happens when the null hypothesis is not rejected despite there being an actual effect, failing to detect a real difference.
What is power analysis in the context of A/B testing?
A/B Testing
Statistical power is the probability of correctly rejecting a false null hypothesis. Power analysis uses a target power (commonly 0.8) to determine the sample size needed to detect a given effect, reducing the chance of Type II errors.
Why is sample size important in A/B testing?
A/B Testing
A sufficient sample size is needed to detect a statistically significant difference. It depends on the expected effect size, statistical power (usually 0.8), and significance level (often 0.05).
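The sample-size card above can be sketched with the standard normal-approximation formula for a two-sided, two-sample test of means. This assumes scipy is available; the effect size is Cohen's d, and the numbers are illustrative:

```python
from scipy.stats import norm

def sample_size_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate n per group for a two-sided two-sample test of means,
    using the normal approximation. effect_size = (mu1 - mu2) / sigma."""
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value for significance level
    z_beta = norm.ppf(power)            # critical value for desired power
    n = 2 * ((z_alpha + z_beta) / effect_size) ** 2
    return int(n) + 1                   # round up to be conservative

# Detecting a small effect (d = 0.2) at alpha = 0.05 and power = 0.8
print(sample_size_per_group(0.2))  # 393 per group
```

Note how strongly the required n grows as the expected effect shrinks: halving the effect size quadruples the sample size.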
What is a multi-variant test?
A/B Testing
A multi-variant test compares more than two variations simultaneously, rather than just a control and treatment.
Name a common pitfall in A/B testing and its solution.
A/B Testing
Sample contamination (when participants in one group are influenced by another) can skew results. One solution is to strictly separate groups or adjust analysis methods.
What statistical test would you use for comparing means in A/B testing?
A/B Testing
A t-test is commonly used to compare the means of two samples, such as conversion rates or average order values, assuming normally distributed data and equal variances.
When should a T-Test be used?
A/B Testing
For comparing the means of two samples (e.g. conversion rates or avg order values)
Assumes normally distributed data and equal variances in both groups
Example: comparing avg revenue per user between a control and a treatment group
Example: comparing avg time spent on a platform between two user groups: those who made a purchase and those who didn't
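A minimal sketch of the revenue-per-user example with scipy, using simulated data in place of real revenue figures:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
# Hypothetical avg revenue per user: treatment shifted up slightly
control = rng.normal(loc=50, scale=10, size=500)
treatment = rng.normal(loc=52, scale=10, size=500)

# equal_var=True is the classic two-sample t-test assumption;
# equal_var=False gives Welch's t-test, which drops it
t_stat, p_value = ttest_ind(control, treatment, equal_var=True)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

If p is below the chosen significance level (often 0.05), the difference in means is considered statistically significant.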
When should a Chi-Square Test be used?
A/B Testing
For categorical data (e.g. converted vs not converted) between 2 groups
Assumes that each observation is independent and expected frequencies in each cell are adequate
Example: checking if there is an association between a user's gender and whether they purchased a product
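The association example above maps directly onto a contingency table. A sketch with scipy and hypothetical counts:

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table: rows = group, columns = purchased / did not
table = [[120, 380],
         [ 90, 410]]

# chi2_contingency computes expected frequencies under independence
# and the chi-square statistic against them
chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}, dof = {dof}")
```

A low p-value suggests the two categorical variables are associated; the `expected` array is worth inspecting to confirm each cell's expected frequency is adequate (a common rule of thumb is at least 5).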
When should a Z-Test be used?
A/B Testing
Similar to a t-test but for large samples or when the population variance is known
Assumes normally distributed data with large sample sizes
Example: comparing click-through rates when you have very large samples
Example: if the historical avg # of listings per user is known, we can compare it to the avg # of listings per user from a new sample after a new feature is introduced, to assess whether the difference is statistically significant
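The click-through-rate example is a two-proportion z-test. A minimal sketch (the counts are hypothetical; scipy is assumed for the normal CDF):

```python
from math import sqrt
from scipy.stats import norm

def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided z-test for equality of two proportions,
    appropriate for large samples such as click-through rates."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)              # pooled proportion under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * norm.sf(abs(z))               # two-sided p-value
    return z, p_value

# Hypothetical CTRs: 5.0% of 20,000 users vs 5.5% of 20,000 users
z, p = two_proportion_ztest(1000, 20000, 1100, 20000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With samples this large, even a half-percentage-point difference in CTR can reach significance.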
When should non-parametric tests be used?
A/B Testing
When the data doesn't meet the assumption of normality (e.g. revenue data often has outliers and is not normally distributed)
Example: for skewed data, like transaction amounts between 2 groups
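A sketch of the skewed-transaction-amounts example using the Mann-Whitney U test from scipy, with lognormal data standing in for real transactions:

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(1)
# Skewed (lognormal) transaction amounts, like revenue data with outliers
group_a = rng.lognormal(mean=3.0, sigma=1.0, size=300)
group_b = rng.lognormal(mean=3.2, sigma=1.0, size=300)

# Mann-Whitney U compares the two distributions via ranks,
# without assuming normality
u_stat, p_value = mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u_stat:.0f}, p = {p_value:.4f}")
```

Because it works on ranks, the test is robust to the extreme values that would distort a t-test on this kind of data.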
When should a Bayesian Test be used?
A/B Testing
To estimate the probability of one version being better than the other. Provides a probability rather than a p-value
Example: estimate how likely the new version is to be better by a specific margin.
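A common way to run this for conversion rates is a Beta-Binomial model: sample from each group's posterior and count how often B beats A. A sketch with numpy and hypothetical conversion counts:

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical conversions: A = 120/2400, B = 140/2400
a_conv, a_n = 120, 2400
b_conv, b_n = 140, 2400

# With a uniform Beta(1, 1) prior, the posterior is
# Beta(conversions + 1, non-conversions + 1)
a_samples = rng.beta(a_conv + 1, a_n - a_conv + 1, size=100_000)
b_samples = rng.beta(b_conv + 1, b_n - b_conv + 1, size=100_000)

prob_b_better = (b_samples > a_samples).mean()
print(f"P(B > A) = {prob_b_better:.3f}")

# Probability that B beats A by at least a 0.5 percentage-point margin
prob_margin = (b_samples - a_samples > 0.005).mean()
print(f"P(B - A > 0.5pp) = {prob_margin:.3f}")
```

The output is a direct probability statement about the versions, which is often easier to act on than a p-value.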
What is the difference between correlation and causation?
Causal Inference
Correlation means two variables move together, but it doesn't imply that one causes the other. Causation implies a direct effect of one variable on another.
What is a confounding variable in causal inference?
Causal Inference
A confounding variable is an external factor that influences both variables being studied, potentially creating a false impression of causation.
What are counterfactuals in causal inference?
Causal Inference
Counterfactuals represent “what could have happened” under a different scenario, such as considering if a recovery would still happen without taking medicine.
What is the Difference-in-Differences method used for?
Causal Inference
It’s used in policy impact analysis where randomized trials are not feasible, comparing changes over time between a treatment and control group.
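The core DiD arithmetic is simple enough to show directly. A sketch with hypothetical group averages (e.g. weekly sales before and after a policy change):

```python
# Hypothetical average outcomes before and after the intervention
treat_pre, treat_post = 100.0, 130.0   # treatment group
ctrl_pre, ctrl_post = 95.0, 105.0      # control group

# DiD subtracts the control group's trend from the treatment
# group's change, removing shared time effects
did = (treat_post - treat_pre) - (ctrl_post - ctrl_pre)
print(f"Estimated treatment effect: {did:.1f}")  # 30 - 10 = 20.0
```

The key identifying assumption is parallel trends: absent the intervention, both groups would have changed by the same amount.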
What is an instrumental variable (IV) in causal inference?
Causal Inference
An IV is a variable affecting the treatment but not directly influencing the outcome except through the treatment, used to address unobserved confounding.
Example: Imagine studying the effect of education on income, but family background (unobserved) affects both. If we use “distance to the nearest college” as an instrument, it affects the likelihood of attending college (treatment) but likely doesn’t directly influence income.
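A simulated sketch of why the IV estimator recovers the true effect when OLS doesn't. The data-generating process is invented purely to illustrate: `u` plays the role of unobserved family background, `z` the instrument, and the true effect of `x` on `y` is set to 2:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 5000
z = rng.normal(size=n)                        # instrument (e.g. distance to college)
u = rng.normal(size=n)                        # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)          # treatment, driven by z and u
y = 2.0 * x + 3.0 * u + rng.normal(size=n)    # outcome; true effect of x is 2

# Naive OLS slope is biased upward because u affects both x and y
ols = np.cov(x, y)[0, 1] / np.var(x)

# IV (Wald) estimator: cov(z, y) / cov(z, x) isolates the variation
# in x that comes only through the instrument
iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]
print(f"OLS estimate: {ols:.2f} (biased), IV estimate: {iv:.2f} (true = 2)")
```

The IV estimate lands near 2 while OLS overshoots, because OLS absorbs the confounder's contribution.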
Quasi-Experimental Methods:
How does Propensity Score Matching (PSM) work?
Causal Inference
PSM matches individuals in treatment and control groups based on observed characteristics to simulate randomization, though it only controls for observed variables.
Example: if we want to study the impact of a job training program, we could match participants (treatment) with non-participants (control) based on characteristics like age, education, and prior job experience.
Limitation: PSM can only control for observed variables.
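A minimal matching sketch for the job-training example. The propensity scores here are assumed to have been estimated elsewhere (e.g. by a logistic regression of participation on age, education, and experience); all numbers are invented:

```python
import numpy as np

# Hypothetical propensity scores (the fitted model itself is omitted)
treated_scores = np.array([0.62, 0.45, 0.80])
control_scores = np.array([0.30, 0.48, 0.61, 0.77, 0.90])
treated_outcomes = np.array([52.0, 47.0, 58.0])    # e.g. post-program earnings
control_outcomes = np.array([40.0, 45.0, 50.0, 55.0, 60.0])

# 1-nearest-neighbor matching: pair each treated unit with the
# control unit whose propensity score is closest
matches = [np.argmin(np.abs(control_scores - s)) for s in treated_scores]

# Average treatment effect on the treated (ATT) from the matched pairs
att = (treated_outcomes - control_outcomes[matches]).mean()
print(f"Matched controls: {matches}, ATT estimate: {att:.2f}")
```

Real applications use a fitted propensity model, calipers to reject poor matches, and balance checks on the matched covariates.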
Quasi-Experimental Methods:
What is the use case of Quasi-Experimental Methods? What are some of these methods?
Causal Inference
These methods help us make causal inferences when randomized experiments aren't feasible
e.g.
- Difference-in-Differences
- Instrumental Variables
- Propensity Score Matching
- Regression Discontinuity Design (RDD)
What is Regression Discontinuity Design (RDD)?
Causal Inference
Used when treatment is assigned based on a cutoff score (e.g. test scores, age)
Compares individuals just above and just below the cutoff, assuming they are otherwise similar.
Example: suppose scholarships are given only to students with GPA >3.0. RDD would compare students just above (eligible) and just below (ineligible) the cutoff to estimate the effect of scholarships on academic success.
Requires a clear cutoff.
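A simulated sketch of the scholarship example. The data-generating process is invented: income depends smoothly on GPA plus a jump of 4 at the cutoff, and the estimate compares local linear fits on each side of the cutoff:

```python
import numpy as np

rng = np.random.default_rng(7)
# GPA is the running variable, income the outcome;
# the true effect of the scholarship is a jump of 4 at GPA 3.0
gpa = rng.uniform(2.0, 4.0, size=2000)
scholarship = gpa > 3.0                             # sharp cutoff
income = 30 + 5 * gpa + 4 * scholarship + rng.normal(0, 2, size=2000)

# Fit a local linear regression within a bandwidth on each side,
# then compare the two fitted values at the cutoff itself
bandwidth = 0.2
left = (gpa > 3.0 - bandwidth) & (gpa <= 3.0)
right = (gpa > 3.0) & (gpa < 3.0 + bandwidth)
left_fit = np.polyfit(gpa[left], income[left], 1)
right_fit = np.polyfit(gpa[right], income[right], 1)
effect = np.polyval(right_fit, 3.0) - np.polyval(left_fit, 3.0)
print(f"RDD estimate at the cutoff: {effect:.2f}")
```

Fitting a line on each side, rather than naively differencing the means near the cutoff, removes the bias from income trending with GPA inside the bandwidth.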
What are the key assumptions of regression analysis?
Econometrics
- Linearity: linear relationship between predictors and outcome variable
- Independence: observations are independent of each other
- Homoscedasticity: Constant variance of residuals across values of independent variables
- No multicollinearity: independent variables arent highly correlated
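Two of these assumptions can be checked from the residuals of a fitted model. A sketch with numpy on simulated data that satisfies the assumptions by construction:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=500)
y = 2.0 + 1.5 * x + rng.normal(0, 1, size=500)   # true line plus i.i.d. noise

# Fit y = b0 + b1*x by ordinary least squares
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# Crude diagnostics:
# - linearity / zero-mean errors: residuals should average ~0
# - homoscedasticity: |residuals| should not trend with x
print(f"coefficients: {beta.round(2)}")
print(f"mean residual: {residuals.mean():.4f}")
print(f"corr(x, |resid|): {np.corrcoef(x, np.abs(residuals))[0, 1]:.3f}")
```

A strong correlation between x and the residual magnitudes would hint at heteroscedasticity; in practice residual-vs-fitted plots and formal tests (e.g. Breusch-Pagan) are used alongside checks like these.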