Causal inference Flashcards

(19 cards)

1
Q

what's the difference between causal discovery and causal inference?

A

Causal discovery: What factors (X) cause a specific outcome (Y)? Identify the underlying causal structure from data

Causal inference: How much does treatment (X) affect outcome (Y)? Quantify how much specific factors affect an outcome

2
Q

what causal estimands are calculated in causal inference?

A

-Average treatment effect (ATE): average effect of a treatment across an entire population
-Conditional average treatment effect (CATE): average effect of a treatment for a specific subgroup within the population
-Average treatment effect on the treated (ATT): average effect of a treatment on those who were treated
-Individual treatment effect (ITE): the effect of a treatment on an individual
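
The four estimands above can be written compactly in potential-outcomes notation. This is a sketch under assumed notation that does not appear in the cards: Y(1) and Y(0) are the outcomes a unit would have with and without treatment, T is the treatment indicator, and X is a covariate vector defining the subgroup.

```latex
\begin{aligned}
\text{ITE}_i    &= Y_i(1) - Y_i(0) \\
\text{ATE}      &= \mathbb{E}\,[\,Y(1) - Y(0)\,] \\
\text{ATT}      &= \mathbb{E}\,[\,Y(1) - Y(0) \mid T = 1\,] \\
\text{CATE}(x)  &= \mathbb{E}\,[\,Y(1) - Y(0) \mid X = x\,]
\end{aligned}
```

Only one of Y(1) and Y(0) is ever observed for a given unit, which is why the ITE generally cannot be computed directly and the averages have to be estimated.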

3
Q

pros/cons/assumptions of RCTs

A

pros: gold standard due to random assignment

cons: can be expensive, time consuming, not feasible or ethically wrong

limitations: if the sample is a specific, narrowly defined population, results might not translate to real-world applications
-possible bias: selection bias, performance bias (unequal adherence), detection bias (differential assessment methods)

assumptions: participants are randomly assigned, no systematic confounding (randomisation balances confounders on average), participants represent the general/target population

effect estimator: mean diff between groups (ATE)
test: t-test (continuous), chi-square test (binary). gives a p-value: if p < 0.05 (the conventional threshold), the result is statistically significant and you reject the null hypothesis
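
A minimal sketch of the difference-in-means estimator and its significance test, using only the Python standard library. The data are made up, and the p-value uses a normal approximation to the Welch t-test (an assumption that is only reasonable for larger samples; a library routine such as `scipy.stats.ttest_ind` would use the t distribution instead).

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def diff_in_means(treated, control):
    """Return (ATE estimate, standard error, z statistic, two-sided p-value)."""
    ate = mean(treated) - mean(control)
    # Welch-style standard error: the two groups may have different variances.
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    z = ate / se
    # Normal approximation to the two-sided p-value (assumption, see lead-in).
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return ate, se, z, p

# Hypothetical outcomes for two randomly assigned groups.
treated = [12.1, 11.4, 13.0, 12.7, 11.9, 12.5, 13.2, 12.0]
control = [11.0, 10.6, 11.8, 11.2, 10.9, 11.5, 11.1, 10.7]

ate, se, z, p = diff_in_means(treated, control)
print(f"ATE={ate:.2f}, SE={se:.3f}, z={z:.2f}, p={p:.2g}")
```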

4
Q

what non-experimental methods can be used for causal inference?

A

-propensity score matching: pair individuals who received the treatment with those who didn’t based on similar propensity scores (probability of treatment given their characteristics), reduces bias from observed confounders

-Instrumental variables: uses variables related to the treatment but not directly to the outcome to isolate the causal effect by providing a source of variation in the treatment that isn’t influenced by confounders

-double machine learning: uses machine learning to control for confounders by creating ‘clean’ versions of treatment and outcome variables, by subtracting estimated effects from confounders

-difference in differences: compares changes in outcomes over time between a treatment and control group, helps to control for confounders that affect both groups similarly over time

-Bayesian structural time series: uses Bayesian methods to model time series data. the effect of the intervention is inferred by comparing observed data with a counterfactual scenario that estimates what would have happened without the intervention

-regression discontinuity design: uses a cutoff or threshold to assign treatment. compare individuals just above and below the cutoff to estimate the effect, assuming they are otherwise similar

5
Q

what assumptions are required for an A/B test to be valid?

A

-participants are randomly assigned to either group
-no selection bias & group representative of population
-independent observations: the outcome of one unit doesn’t affect the outcome of another
-no external changes during the test

6
Q

what method to use when there is a pre-existing difference between groups? (experimental data)

A

CUPED (Controlled-experiment Using Pre-Experiment Data).

Use pre-experiment data to predict each individual’s post-experiment metric, i.e. their baseline without treatment, and adjust the metric by that prediction so that chance pre-existing differences between groups are removed.

7
Q

what does it mean when an experiment is underpowered? how to solve?

A

An experiment is underpowered when the treatment effect is too small relative to the metric’s variance for a given sample size (ie the variance in the data is too large to detect an effect even if a real effect exists)

CUPED (Controlled-experiment Using Pre-Experiment Data) tries to remove the variance in a metric that can be accounted for by pre-experiment information - variance that pre-experiment data can explain is unrelated to the effects of the experiment and can therefore be removed

reducing variance helps to increase power to detect small effects
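
A minimal sketch of the CUPED adjustment, assuming a single pre-experiment covariate `x` (for example, the same metric measured before the experiment) and the common linear form `y - theta * (x - mean(x))` with `theta = cov(x, y) / var(x)`. The function name and the data are hypothetical.

```python
from statistics import mean, variance

def cuped_adjust(y, x):
    """Return CUPED-adjusted values y_i - theta * (x_i - mean(x))."""
    mx, my = mean(x), mean(y)
    # Sample covariance between the pre-experiment covariate and the metric.
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    theta = cov_xy / variance(x)
    return [b - theta * (a - mx) for a, b in zip(x, y)]

# Hypothetical pooled data (both groups together): pre-period and in-experiment metric.
pre = [10, 12, 9, 15, 11, 14, 8, 13]
post = [11, 13, 9, 16, 12, 15, 9, 14]

adjusted = cuped_adjust(post, pre)
print(variance(post), variance(adjusted))  # adjusted variance is smaller
```

In this sketch `theta` is estimated on data pooled across treatment and control; the treatment effect is then the difference in group means of the adjusted metric. The mean of the metric is unchanged by the adjustment, only its variance shrinks.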

8
Q

what method to use when some people in the treatment group do not actually receive the treatment? (experimental data)

A

complier average causal effect (CACE) - average causal effect among compliers

-adjusts the intention to treat effect with the compliance rate in order to estimate the treatment effect for the subpopulation that is actually being treated

-estimated using an instrumental variable approach where the instrument is the group assignment and the treatment actually received is the treatment variable (assignment influences the outcome only through it)

-CACE = intent-to-treat effect (effect of assignment on outcome) / effect of assignment on treatment received (proportion of compliers)

-assumption: treatment assignment affects outcome only through treatment being received
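
The ratio above can be sketched as a small function. The data are made up, and the example assumes one-sided non-compliance (nobody in the control group receives the treatment); the names `assigned`, `received` and `outcome` are hypothetical.

```python
from statistics import mean

def cace(assigned, received, outcome):
    """Ratio estimator: (effect of assignment on outcome) / (effect of assignment on receipt)."""
    y1 = mean([y for z, y in zip(assigned, outcome) if z == 1])
    y0 = mean([y for z, y in zip(assigned, outcome) if z == 0])
    d1 = mean([d for z, d in zip(assigned, received) if z == 1])
    d0 = mean([d for z, d in zip(assigned, received) if z == 0])
    itt = y1 - y0          # intent-to-treat effect
    compliance = d1 - d0   # share of compliers
    if compliance == 0:
        raise ValueError("assignment does not change treatment receipt")
    return itt / compliance

# Half of the assigned group actually takes the treatment.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
received = [1, 1, 0, 0, 0, 0, 0, 0]
outcome = [10, 10, 6, 6, 6, 6, 6, 6]

effect = cace(assigned, received, outcome)
print(effect)  # ITT of 2 divided by a compliance rate of 0.5
```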

9
Q

when do you use propensity score matching? what is it?

A

propensity score is the conditional probability of someone being assigned to a treatment given a set of covariates.

-estimate propensity score using logistic regression

-units in the treatment group are then matched with units in the control group with similar propensity scores
-select control and treated units with similar characteristics. reduces selection bias by creating balanced treatment and control groups

assumptions:
-treatment assignment is independent of the potential outcomes given covariates (ie all confounders are observed and included)
-positive probability of receiving treatment and control for every value of the covariates

limitations:
-can’t control for unobserved variables/confounders
-matching can result in discarding unmatched units, reducing sample size and statistical power
-quality of matching depends on the correct propensity score, poor model choice or omission of variables can lead to biased estimates
-doesn’t work well when treated and control groups are too different (lack of common support)

-use when: randomisation is not possible, data is not observed over time, and pre-intervention covariates capture all confounders (data is rich in confounders and they can all be observed)

-estimand: ATT
-tests: same as for an RCT (t-test, chi-square test)
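
A minimal sketch of the matching step only, assuming the propensity scores have already been estimated (for example with a logistic regression). It uses nearest-neighbour matching with replacement and a caliper; the function name, the caliper value and the data are hypothetical.

```python
def att_by_matching(units, caliper=0.05):
    """units: list of (propensity_score, treated_flag, outcome) tuples.

    Returns (ATT estimate, number of treated units that found a match).
    """
    treated = [u for u in units if u[1] == 1]
    controls = [u for u in units if u[1] == 0]
    diffs = []
    for score, _, y in treated:
        # Closest control on the propensity score (matching with replacement).
        nearest = min(controls, key=lambda c: abs(c[0] - score))
        if abs(nearest[0] - score) <= caliper:
            diffs.append(y - nearest[2])
        # Treated units with no control inside the caliper are discarded.
    if not diffs:
        raise ValueError("no treated unit has a control within the caliper")
    return sum(diffs) / len(diffs), len(diffs)

units = [
    (0.80, 1, 12.0), (0.60, 1, 10.0), (0.95, 1, 15.0),  # treated
    (0.78, 0, 9.0), (0.62, 0, 8.0), (0.20, 0, 5.0),     # controls
]
att, n_matched = att_by_matching(units)
print(att, n_matched)
```

The treated unit with score 0.95 has no control nearby and is dropped, which illustrates both the lost sample size and the common-support limitation listed above.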

10
Q

what is an alternative to regular propensity score matching?

A

inverse probability of treatment weighting (IPTW): instead of matching individuals, each observation is weighted based on the inverse of the probability that they received the treatment they actually received - estimates ATE instead of ATT

-same assumptions as PSM
-limitation: if units have propensity scores near 1 or 0 the resulting weights can cause high variance and unreliable estimates
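
A minimal sketch of IPTW, assuming the propensity scores are already estimated. It uses the normalised (weights sum to one within each arm) version of the estimator, which is one common choice rather than the only one; the function name and data are hypothetical.

```python
def iptw_ate(units):
    """units: list of (propensity_score, treated_flag, outcome) tuples.

    Treated units get weight 1/p, control units get weight 1/(1-p).
    """
    w_treated = sum(1 / p for p, t, _ in units if t == 1)
    w_control = sum(1 / (1 - p) for p, t, _ in units if t == 0)
    mean_treated = sum(y / p for p, t, y in units if t == 1) / w_treated
    mean_control = sum(y / (1 - p) for p, t, y in units if t == 0) / w_control
    return mean_treated - mean_control

units = [(0.5, 1, 10.0), (0.5, 0, 6.0), (0.8, 1, 12.0), (0.8, 0, 7.0)]
ate = iptw_ate(units)
print(round(ate, 3))
```

The control unit with score 0.8 receives weight 1 / (1 - 0.8) = 5; scores closer to 1 would inflate that weight further, which is the variance problem noted in the limitation above.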

11
Q

what is the regression discontinuity method? when to use?

A

the idea is that units just above and just below the cutoff are likely similar in all respects except for the treatment, so use the jump to estimate the causal effect
-controls for selection bias around the cutoff
-when assignment to treatment/control occurs around a threshold or cutoff

-assumptions:
-the potential outcome is smooth around the cutoff
-no sudden jumps in the outcome at treatment unless caused by the treatment
-individuals can’t manipulate the assignment (running) variable to fall on a chosen side of the cutoff
-units near the cutoff are similar and can be thought of as randomly assigned to treatment/control.

-limitations:
-estimates are only local to the cutoff point, can’t generalise to values of the assignment variable far from the cutoff
-need large sample near cutoff

-use when: data is not repeated over time, treatment assignment depends on a sharp cutoff

-local average treatment effect (only around the cutoff point)
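
A minimal sketch of a sharp regression discontinuity estimate: fit a separate straight line on each side of the cutoff within a bandwidth, and take the gap between the two fitted values at the cutoff. The assumption that units at or above the cutoff are treated, the bandwidth, and the noise-free data are all illustrative choices.

```python
def ols_line(xs, ys):
    """Simple least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope

def rdd_estimate(running, outcome, cutoff, bandwidth):
    """Jump in the fitted outcome at the cutoff (treated side minus untreated side)."""
    pairs = list(zip(running, outcome))
    # Centre the running variable so each intercept is the fitted value at the cutoff.
    left = [(x - cutoff, y) for x, y in pairs if cutoff - bandwidth <= x < cutoff]
    right = [(x - cutoff, y) for x, y in pairs if cutoff <= x <= cutoff + bandwidth]
    left_at_cutoff, _ = ols_line([x for x, _ in left], [y for _, y in left])
    right_at_cutoff, _ = ols_line([x for x, _ in right], [y for _, y in right])
    return right_at_cutoff - left_at_cutoff

# Synthetic data: a smooth linear trend plus a jump of 3 at the cutoff of 50.
running = list(range(40, 60))
outcome = [2 + 0.5 * x + (3 if x >= 50 else 0) for x in running]

jump = rdd_estimate(running, outcome, cutoff=50, bandwidth=10)
print(round(jump, 6))
```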

12
Q

what is interrupted time series design? and when to use?

A

evaluate causal impact by analysing the trends before and after treatment. estimate whether there is a change at the time the event takes place. use when treatment occurs at a known specific time

-model the outcome variable over time and look for either a jump in the outcome or change in trajectory

assumptions:
-no other events occur at same time that could affect outcome
-the outcome variable follows a steady trend before treatment
-errors are not correlated, or they are accounted for

limitations:
-need sufficient pre and post data
-no control group, making it harder to rule out confounders that changed at the same time

-measures: effect of treatment on the treated
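
A minimal sketch of one simple way to carry this out: fit a linear trend to the pre-intervention periods, project it forward as the counterfactual, and average the gap between observed and projected values. This ignores autocorrelated errors and any change in slope; segmented regression is a common fuller alternative. The function names and the series are hypothetical.

```python
def ols_line(xs, ys):
    """Simple least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope

def its_effect(series, t0):
    """series: outcome per period; t0: index of the first post-intervention period."""
    intercept, slope = ols_line(list(range(t0)), series[:t0])
    # Gap between what was observed and what the pre-trend would have predicted.
    gaps = [series[t] - (intercept + slope * t) for t in range(t0, len(series))]
    return sum(gaps) / len(gaps)

# Steady trend of +1 per period, then a level shift of 5 at period 5.
series = [10, 11, 12, 13, 14, 20, 21, 22]
effect = its_effect(series, t0=5)
print(round(effect, 6))
```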

13
Q

what is the difference in differences approach? when to use?

A

estimates the causal effect of a treatment by comparing the before-and-after differences in outcomes between a treatment group and a control group. the difference of the differences isolates the effect of the treatment

assumptions:
-parallel trends: in the absence of treatment, the difference in outcomes between the treatment and control groups would have remained constant over time
-no confounders in only the treatment group that occurred at time of intervention
-individuals in each group are comparable over time

limitations:
-hard to verify parallel trends, but can check pre-treatment trends
-time-varying confounders can still affect results - try to adjust for these

use when: you have both control and treated group observed before and after intervention

measure: DiD estimator (ATT) (additional change in treatment group due to treatment)
-test: p-value for the DiD estimate
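
A minimal sketch of the basic two-group, two-period DiD estimator computed from group means. The data are made up; in practice the same quantity is often obtained as the coefficient on a treated-by-post interaction term in a regression, which also yields a standard error.

```python
from statistics import mean

def did(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """(change in treated group) minus (change in control group)."""
    return (mean(treat_post) - mean(treat_pre)) - (mean(ctrl_post) - mean(ctrl_pre))

# Treated group rises by 5, control group rises by 2 over the same period.
estimate = did(
    treat_pre=[10, 12], treat_post=[15, 17],
    ctrl_pre=[9, 11], ctrl_post=[11, 13],
)
print(estimate)  # the extra change of 3 is attributed to the treatment
```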

14
Q

how can you use instrumental variables to estimate causation? what scenarios to use this in?

A

use a variable/instrument that:
-affects the treatment
-affects the outcome only through the treatment (not directly)

-use the instrument to predict the treatment, then use the predicted treatment to estimate the outcome
-tries to isolate the causal effect

assumptions:
-instrument and causal variable are strongly correlated
-instrument affects the outcome only through the causal variable
-instrument is uncorrelated with other unobserved factors

limitations:
-if the instrument is weak (not strongly correlated with the treatment) then estimates are biased
-only estimates the local average treatment effect (only for compliers affected by the instrument)
-you need at least as many valid instruments as endogenous variables

use when: randomisation not possible, the treatment variable is correlated with unobserved confounders (you suspect endogeneity)
-you can’t measure all confounders

-Local average treatment effect (treatment effect for compliers)
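
A minimal sketch of the two-stage procedure with a single instrument: regress the treatment on the instrument, then regress the outcome on the predicted treatment. The synthetic data are built so that an unobserved confounder `u` biases the naive regression while the instrument `z` is uncorrelated with `u`; all names and values are hypothetical.

```python
def ols_line(xs, ys):
    """Simple least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope

def two_stage_least_squares(z, x, y):
    """Stage 1: predict treatment x from instrument z. Stage 2: regress y on the prediction."""
    a, b = ols_line(z, x)
    x_hat = [a + b * zi for zi in z]
    _, effect = ols_line(x_hat, y)
    return effect

z = [0, 0, 1, 1]                                  # instrument
u = [0, 1, 0, 1]                                  # unobserved confounder
x = [zi + ui for zi, ui in zip(z, u)]             # treatment depends on both
y = [2 * xi + 3 * ui for xi, ui in zip(x, u)]     # true effect of x on y is 2

_, naive = ols_line(x, y)
iv = two_stage_least_squares(z, x, y)
print(naive, iv)  # naive slope is pushed upward by u; the IV estimate is not
```

Standard errors from running the second stage by hand are not valid, so a dedicated IV routine is normally used for inference; this sketch only shows the point estimate.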

15
Q

what is sensitivity analysis?

A

last resort when there is no instrument and unmeasured confounders are suspected
-idea is to quantify how strong an unmeasured confounder would need to be to invalidate conclusion
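
One concrete way to quantify this, not named in the card, is the E-value: for an observed risk ratio RR, it is the minimum strength of association (on the risk-ratio scale) that an unmeasured confounder would need with both treatment and outcome to fully explain the result away. A minimal sketch, assuming the effect is reported as a risk ratio:

```python
from math import sqrt

def e_value(rr):
    """E-value for an observed risk ratio: RR + sqrt(RR * (RR - 1)) for RR >= 1."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr  # protective effects are inverted first
    return rr + sqrt(rr * (rr - 1))

print(round(e_value(2.0), 3))  # 3.414
```

A larger E-value means a stronger hidden confounder would be needed to invalidate the conclusion.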

16
Q

when given a case study question that asks you to test a hypothesis about a cause (e.g. we think this is causing this, how would you test it?), what steps should you follow to answer? Assume an A/B test is possible/the answer

A
  1. understand the question: ensure you understand what the parts of it are saying. ask follow up if needed - clarify definitions
    -find out if change is slow over period of time or drastic (if drastic likely not due to a feature)
    -start with exploratory analysis, e.g. a graph of what is happening
    -trying to sense check whether the hypothesis is correct
  2. validate hypothesis
  3. think about how to check problem, by test or analysis
  4. share recommendations
17
Q

whats the framework for designing an A/B test?
-use feature and number of orders as example

A
  1. Hypothesis formulation:
    -null hypothesis and alternate hypothesis
    -null: no change in number of orders with feature
    -alternate: some change
  2. network effect:
    -when running an experiment all experiment units should be independent of each other
    -can users influence each other?
    -identify if there is a network effect problem that needs to be considered
  3. randomisation unit
    -based on network effect
  4. power analysis - estimate the sample size
  5. length of experiment
  6. A/A test and basic sanity check
  7. experiment analysis
    -run A/B test, with one group using the feature
    -check if statistically significant change in order numbers between groups during experiment period
  8. recommendation
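
Step 4 (power analysis) can be sketched with the standard sample-size formula for comparing two means, n per group = 2 (z_{1-alpha/2} + z_{power})^2 sigma^2 / delta^2. The inputs are assumptions you would supply: `sigma` is the standard deviation of the metric (e.g. orders per user) and `mde` is the smallest difference worth detecting; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(sigma, mde, alpha=0.05, power=0.8):
    """Users needed in each group for a two-sided test on a difference in means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 * sigma ** 2 / mde ** 2)

n = sample_size_per_group(sigma=1.0, mde=0.1)
print(n)  # 1570
```

Dividing the required sample size by expected daily traffic gives a rough lower bound on the experiment length in step 5.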
18
Q

how to do A/B test if feature is already released?

A

Ablation - reverse of an A/B test. stop showing the feature to a randomly chosen group and use them as the treatment group. test in the same way

19
Q

how to do A/B test when results will take long time to show?

A

holdback test - give the feature to most people so you can get gains from it sooner, but keep a small holdback (control) group without it. compare the holdback group with the exposed group over a longer period of time