Midterm Flashcards

(122 cards)

1
Correlation
The correlation between two features of the world is the extent to which they tend to occur together.
2
Positively correlated:
When higher (lower) values of one variable tend to occur with higher (lower) values of another variable, we say that the two variables are positively correlated.
3
Negatively correlated
When higher (lower) values of one variable tend to occur with lower (higher) values of another variable, we say that the two variables are negatively correlated.
4
Uncorrelated:
When there is no correlation between two variables, meaning that higher (lower) values of one variable do not systematically coincide with higher or lower values of the other variable, we say that they are uncorrelated.
5
Line of best fit:
A line that minimizes how far data points are from the line on average, according to some measure of distance from the data to the line.
6
Mean (μ)
The average value of a variable.
7
Deviation from the mean:
The distance between an observation’s value for some variable and the mean of that variable.
8
Variance (σ²)
A measure of how variable a variable is. It is the average of the squared deviations from the mean.
9
Standard deviation (σ)
Another measure of how variable a variable is. The standard deviation is the square root of the variance. It has the advantage of being measured on the same scale as the variable itself and roughly corresponds to how far the typical observation is from the mean (though, like the variance, it puts more weight on observations far from the mean).
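To make the formulas above concrete, here is a minimal Python sketch (the data values are made up for illustration) computing the mean, the deviations from the mean, the variance, and the standard deviation:

```python
# Hypothetical data values for illustration
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(x)
mean = sum(x) / n                               # mean (mu)
deviations = [xi - mean for xi in x]            # deviations from the mean
variance = sum(d ** 2 for d in deviations) / n  # average squared deviation (sigma^2)
std_dev = variance ** 0.5                       # square root of the variance (sigma)

print(mean, variance, std_dev)  # 5.0 4.0 2.0
```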
10
Covariance (cov)
A measure of the correlation between two variables. It is calculated as the average of the product of the deviations from the mean.

Covariance is a measure of how much two variables change together. It indicates whether an increase in one variable corresponds to an increase or decrease in another.

However, it is not standardized and depends on the units of the variables involved, making it hard to interpret the strength of the relationship.
11
Correlation coefficient (r)
Another measure of the correlation between two variables. It is calculated as the covariance divided by the product of the standard deviations. The correlation coefficient takes a value between −1 and 1, with −1 reflecting perfect negative linear dependence, 0 reflecting no correlation, and 1 reflecting perfect positive linear dependence.
12
The square of the correlation coefficient (r^2)
It takes values between 0 and 1 and is often interpreted as the proportion of the variation in one variable explained by the other variable. But we have to pay careful attention to what we mean by “explained.” Importantly, it doesn’t mean that variation in one variable causes variation in the other.

It provides an indication of how well the regression equation fits the data. A higher r^2 indicates a better fit, suggesting that the independent variable is successful in explaining or predicting variations in the dependent variable.

Strength of relationship: The closer r^2 is to 1, the stronger the linear relationship between the two variables. If r^2 is close to 0, it suggests a weak or no linear relationship.

Interpretation: For example, if r^2 is 0.75, it means that 75% of the variance in the dependent variable is explained by the independent variable.
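Continuing with the same style of toy example, a short sketch of how covariance, r, and r^2 relate to one another; note that r divides the covariance by the product of the standard deviations, not the variances:

```python
# Hypothetical paired data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Covariance: average of the product of the deviations from the mean
cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n

# Standard deviations
sd_x = (sum((xi - mean_x) ** 2 for xi in x) / n) ** 0.5
sd_y = (sum((yi - mean_y) ** 2 for yi in y) / n) ** 0.5

# Correlation coefficient: covariance over the product of standard deviations
r = cov / (sd_x * sd_y)
r_squared = r ** 2  # proportion of variation in y "explained" by x

print(cov, r, r_squared)  # 1.2 0.7745... 0.6
```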
13
Sum of squared errors
The sum of the square of the distance from each data point to a given line of best fit. This gives us one way of measuring how well the line fits/describes/explains the data.
14
OLS regression line
The line that best fits the data, where “best fit” means that it minimizes the sum of squared errors.
15
Slope of the regression line or regression coefficient:
The slope of the regression line describes how the value of one variable changes, on average, when the other variable changes. It is the covariance of the two variables divided by the variance of the independent variable, and it is sometimes also called the regression coefficient.
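Putting the last few cards together, a minimal sketch (toy data again) of the OLS regression line: the slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the point of means:

```python
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n
var_x = sum((xi - mean_x) ** 2 for xi in x) / n

beta = cov_xy / var_x           # slope (regression coefficient)
alpha = mean_y - beta * mean_x  # intercept: line passes through (mean_x, mean_y)

# Sum of squared errors for this line (the quantity OLS minimizes)
sse = sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))
print(alpha, beta, sse)  # 2.2 0.6 2.4
```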
16
Causal Effect
Informally, the change in some feature of the world that would result from a change to some other feature of the world. Formally, the difference in the potential outcomes for some unit under two different treatment statuses.
17
Counterfactual comparison
A comparison of things in two different worlds or states of affairs, at least one of which does not actually exist.
18
Treatment
Terminology we use to describe any intervention in the world. We usually use this terminology when we are thinking about the causal effect of the treatment, so we want to know what happens with and without the treatment. Importantly, although it sounds like medical terminology, treatment as we use it can refer to anything that happens in the world that might have an effect on something else.
19
Potential outcomes framework
A framework for representing counterfactuals. For each unit i, we write Y_i(1) for the potential outcome if treated and Y_i(0) for the potential outcome if untreated; the causal effect of the treatment for unit i is the difference Y_i(1) − Y_i(0).
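To make the notation concrete, a small hypothetical potential-outcomes table in Python. The unit-level causal effect is Y_i(1) − Y_i(0); as the next cards discuss, only one of the two potential outcomes is ever observed for any unit:

```python
# Hypothetical potential outcomes for four units
y1 = [3, 5, 2, 4]  # Y_i(1): outcome each unit would experience if treated
y0 = [1, 5, 3, 2]  # Y_i(0): outcome each unit would experience if untreated

effects = [a - b for a, b in zip(y1, y0)]  # unit-level causal effects
ate = sum(effects) / len(effects)          # average treatment effect
print(effects, ate)  # [2, 0, -1, 2] 0.75

# In real data we only observe one potential outcome per unit:
treated = [1, 0, 1, 0]
observed = [y1[i] if treated[i] else y0[i] for i in range(4)]
```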
20
Potential Outcome
The potential outcome for some unit under some treatment status is the outcome that unit would experience under that (possibly counterfactual) treatment status.
21
Fundamental problem of causal inference:
This refers to the fact that, since we only observe any given unit in one treatment status at any one time, we can never directly observe the causal effect of a treatment.
22
Heterogeneous treatment effects
When the effect of a treatment is not the same for every unit of observation (as in the case of flu shots and virtually every other interesting example of a causal relationship), we say that the treatment effects are heterogeneous. Sometimes we’re still interested in the average effect even though we know the treatment effects are heterogeneous, and sometimes we want to explicitly study the nature of the heterogeneity. (In contrast, when discussing the unlikely possibility that treatment effects are the same for every unit, we would refer to homogeneous treatment effects.)
23
Selecting on the dependent variable
Examining only instances when the phenomenon of interest occurred, rather than comparing cases where it occurred to cases where it did not occur.
24
Dependent variable:
The variable associated with the outcome we are trying to describe, predict, or explain.
25
Independent or Explanatory variable:
A variable we are using to try to describe, predict, or explain the dependent variable.
26
Regression equation:
An equation linearly relating a dependent variable to some independent variables.
27
Regression parameters
The parameters (intercept and slopes) that relate a dependent variable to some independent variables in a regression equation. Alpha is the intercept; beta is the slope.
28
Error
The difference between the value of the outcome variable for an individual data point and the predicted value for that same data point. This is sometimes also referred to as the residual.
29
Sum of squared errors (SSE):
For a given line, calculate the error for each data point by finding its vertical distance from the line. The sum of squared errors for that line is found by squaring each of the individual errors and adding them together.
30
Ordinary least squares (OLS) regression:
The method for finding the line of best fit through data that minimizes the sum of squared errors.
31
Regression line
The line of best fit through the data that one gets from OLS regression.
32
Intercept (alpha)
In the context of a regression, the intercept tells us the predicted value of the outcome when the values of all the explanatory variables are set to 0. This is also referred to as the constant term. Sometimes the intercept has a substantive interpretation, but sometimes it doesn’t because it doesn’t make sense to think about situations where all the explanatory variables are zero (for example, predicted voter turnout for people with an age of zero). In any case, we always include the intercept when we run a regression (except in very unusual circumstances where we know from theory that the intercept should be zero).
33
Conditional mean function:
A function that tells you the mean (average) of some variable conditional on the value of some other variables.
34
Out-of-sample prediction:
Using regression (or another statistical technique) to predict the outcome for observations that were not included in the original data you used to generate your predictions.
35
Overfitting
Attempting to predict a dependent variable with too many independent variables, so that variables appear to predict the dependent variable in the data but have no actual relationship with it in the world.
36
Estimand
The unobserved quantity we are trying to learn about with our data analysis.
37
Estimator
The procedure applied to data to generate a numerical result.
38
Estimate
The numerical result arising from the application of our estimator to a specific set of data.
39
Bias
Differences between our estimand and our estimate that arise for systematic reasons—that is, for reasons that will persist on average over many different samples of data.
40
Noise
Differences between our estimand and our estimate that arise due to idiosyncratic facts about our sample.
41
Unbiasedness:
An estimate/estimator is unbiased if, were we to repeat our estimation procedure an infinite number of times, the average value of our estimates would equal the estimand.
42
Expectation or Expected value:
The average value of an infinite number of draws of a variable is the variable’s expected value or its value in expectation.
43
Precision
An estimate/estimator is precise if by repeating our estimation procedure over and over again, the various estimates would be close to each other. The more similar the hypothetical estimates from repeating the estimator, the more precise the estimate.
44
Sampling Distribution
The distribution of estimates that we would get if we repeated our estimator an infinite number of times, each time with a new sample of data.
45
Standard Error
The standard deviation of the sampling distribution. If the estimator is unbiased, the standard error gives us a sense of how far, on average, our estimate would be from the estimand if we repeated our procedure over and over with independent samples of data.
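A simulation sketch of the last few cards, under an assumed setup where the estimator is a sample mean computed on samples of size 100: repeating the estimator on fresh samples traces out the sampling distribution, and its standard deviation is the standard error:

```python
import random
import statistics

random.seed(0)

def one_estimate(n=100):
    # Draw a fresh sample and apply the estimator (here, the sample mean)
    sample = [random.gauss(mu=50, sigma=10) for _ in range(n)]
    return statistics.mean(sample)

# Approximate the sampling distribution with 10,000 repetitions
estimates = [one_estimate() for _ in range(10_000)]
standard_error = statistics.stdev(estimates)  # should be close to 10 / sqrt(100) = 1.0
margin_of_error = 2 * standard_error          # pollsters' convention (see the next card)
print(statistics.mean(estimates), standard_error, margin_of_error)
```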
46
Margin of Error
Pollsters often multiply the standard error by 2 and report this as the margin of error. For example, if a survey reports a mean value of 50 with a margin of error of ±3, the corresponding 95% confidence interval runs from 47 to 53.
47
95% confidence interval:
If we applied the estimator an infinite number of times, each time on a new sample of data, the estimand would be contained in the 95% confidence interval (newly calculated each time) 95 percent of the time. Importantly, it is not true that we are 95 percent confident that the true estimand lies in the 95% confidence interval.
48
Hypothesis Testing
Statistical techniques for assessing how confident we should be that some feature of the data reflects a real feature of the world rather than arising from noise.
49
Null Hypothesis
The hypothesis that some feature of the data is entirely the result of noise.
50
Statistical Significance
We say that we have statistically significant evidence for some hypothesis when we can reject the null hypothesis at some pre-specified level of confidence (typically, 95% confidence).
51
p-value
The probability of finding a relationship as strong as or stronger than the relationship found in the data if the null hypothesis is true. We use p-values to assess statistical significance. For instance, if the p-value is less than .05, then we have statistically significant evidence (at the 95% confidence level) that the relationship is real. Importantly, the p-value is not equal to the probability that the null hypothesis is true.
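One way to make the p-value definition concrete is a permutation test; the sketch below uses made-up group outcomes and asks how often shuffled (null) data produce a difference in means as strong as or stronger than the observed one:

```python
import random

random.seed(0)

treated = [6.1, 7.3, 5.8, 8.0, 6.9]  # hypothetical outcomes, treated group
control = [5.2, 6.0, 5.5, 6.4, 5.9]  # hypothetical outcomes, control group

def diff_in_means(a, b):
    return sum(a) / len(a) - sum(b) / len(b)

observed = diff_in_means(treated, control)

# Under the null hypothesis, group labels are meaningless, so shuffle them
pooled = treated + control
n_reps = 10_000
count = 0
for _ in range(n_reps):
    random.shuffle(pooled)
    fake_treated, fake_control = pooled[:5], pooled[5:]
    if abs(diff_in_means(fake_treated, fake_control)) >= abs(observed):
        count += 1

p_value = count / n_reps  # share of null worlds with a relationship this strong or stronger
print(observed, p_value)
```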
52
Publication Bias
The phenomenon whereby published results are systematically over-estimates because there is a bias toward publishing statistically significant results.
53
p-hacking:
Searching over lots of different ways to run an experiment, make a comparison, or specify a statistical model until you find one that yields a statistically significant result and then only reporting that one.
54
p-screening:
A social process whereby a community of researchers, through its publication standards, screens out studies with p-values above some threshold, giving rise to publication bias.
55
Hawthorne effect
The phenomenon whereby subjects change their behavior because they know they are being studied.
56
Demand Effect
A specific instance of a Hawthorne effect in which research subjects change their behavior to try to please the researcher.
57
Signal
The systematic component of an outcome that is persistent across observations. For example, the genes of a plant are signal, while the amount of sun a particular plant happens to get is noise; the skill of a golfer is signal, while luck in a particular round is noise.
58
Noise
Random components of an outcome that change from observation to observation.
59
Reversion to the mean
The phenomenon whereby, if one observation of an outcome made up of signal and noise is particularly large (respectively, small), other observations will typically be smaller (respectively, larger).
60
Controlling
Using a statistical technique to find the correlation between two variables, holding the value of other variables constant.
61
Dummy Variable
A variable that indicates whether a given unit has some particular characteristic, taking a value of 1 if the unit has that characteristic and 0 if the unit does not.
62
Dependent or Outcome variable:
The variable in your data corresponding to the feature of the world that you are trying to understand or explain with your regression.
63
Treatment Variable
The variable in your data corresponding to the feature of the world whose effect on the dependent variable you are trying to estimate.
64
Control Variable
A variable in your data that you include in your statistical analysis in an attempt to reduce bias in your estimate of a causal effect
65
Omitted variables bias
The bias resulting from failing to control for some confounder when attempting to estimate a causal effect.
66
Local average treatment effect (LATE):
The average treatment effect for some specific subset of the population.
67
Blocked/Stratified Random Assignment
The process of dividing experimental subjects into different groups (typically groups that you believe have similar potential outcomes) and then randomizing your treatment within each of those groups. This can significantly improve the precision of your estimates. If the probability of treatment varies across blocks or strata, you will have to account for this (e.g., by controlling for block-fixed effects) in order to obtain unbiased estimates.
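A minimal sketch of blocked random assignment with hypothetical units and block labels: within each block, half the units are randomly assigned to treatment:

```python
import random

random.seed(0)

# Hypothetical units grouped into blocks believed to have similar potential outcomes
blocks = {
    "young": ["u1", "u2", "u3", "u4"],
    "old":   ["u5", "u6", "u7", "u8"],
}

assignment = {}
for block, units in blocks.items():
    shuffled = random.sample(units, k=len(units))  # random order within the block
    half = len(shuffled) // 2
    for unit in shuffled[:half]:
        assignment[unit] = 1  # assigned to treatment
    for unit in shuffled[half:]:
        assignment[unit] = 0  # assigned to control

print(assignment)
```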
68
Noncompliance
When an experimental subject chooses a treatment status other than the one to which it was assigned.
69
Compliers
Units that take up the treatment status they are assigned.
70
Always Takers
Units that are always treated, regardless of whether they are assigned to be treated or untreated.
71
Never Takers
Units that are never treated, regardless of whether they are assigned to be treated or untreated.
72
Defiers
Units that take up the opposite of the treatment status they are assigned.
73
Intent-to-treat (ITT) or reduced-form effect
The average effect on the outcome of being assigned to the treated rather than the untreated group. This need not be the average treatment effect because of noncompliance.
74
First Stage effect
The average effect of being assigned to the treated group on take-up of the treatment. This corresponds to the fraction of compliers.
75
Complier average treatment effect (CATE):
The average treatment effect for the compliers—a special kind of LATE.
76
Instrumental variables (IV):
A set of procedures for estimating the CATE in the presence of noncompliance. The Wald estimator is a special case of instrumental variables. All IV designs require that we can credibly estimate the effect of the instrument on the treatment and on the outcome (exogeneity), that the instrument affects the treatment (there are compliers), that the instrument affects the outcome only through its effect on the treatment (exclusion restriction), and that there are not a large number of units who take up the treatment if and only if the instrument assigns them to the untreated group (defiers).
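A sketch of the Wald estimator with hypothetical data: the CATE estimate is the reduced-form (ITT) effect divided by the first-stage effect:

```python
# Hypothetical experiment with noncompliance.
# z: assignment (instrument), d: actual treatment take-up, y: outcome
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 0, 1, 0, 0, 1, 0]
y = [5.0, 6.0, 4.0, 7.0, 4.0, 3.0, 6.0, 3.0]

def mean(values):
    return sum(values) / len(values)

def group_mean(values, flags, flag):
    return mean([v for v, f in zip(values, flags) if f == flag])

itt = group_mean(y, z, 1) - group_mean(y, z, 0)          # reduced-form (ITT) effect
first_stage = group_mean(d, z, 1) - group_mean(d, z, 0)  # net share of compliers
wald = itt / first_stage                                 # complier average treatment effect
print(itt, first_stage, wald)  # 1.5 0.5 3.0
```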
77
Exogeneity:
An instrument is exogenous if it is randomly assigned or “as if” randomly assigned such that we can get an unbiased estimate of both the first-stage and reduced-form effects.
78
Exclusion restriction:
An instrument satisfies the exclusion restriction if it affects the outcome only through its effect on the treatment, not through any other channel.
79
Chance imbalance:
The situation where, despite random assignment, the treated and untreated groups differ in important ways because of noise
80
Statistical power:
The statistical power of a study is technically defined as the probability of rejecting the null hypothesis of no effect if the true effect is of a certain non-zero magnitude. Colloquially, it is the likelihood that the study will detect an effect that really exists; we say that a study has low statistical power if it was unlikely to produce a statistically significant result even if the effect being investigated is large.
81
Attrition:
The situation where experimental subjects drop out of the experiment, such that you do not observe outcomes for those subjects. Attrition is different from noncompliance.
82
Interference:
The situation where the treatment status of one unit affects the outcome of another unit.
83
Natural experiment:
When something was randomized not for research purposes, but careful analysts are nevertheless able to utilize this randomization to answer an interesting causal question.
84
Running variable:
A variable for which units’ treatment status is determined by whether their value of that variable is on one or the other side of some threshold.
85
Regression discontinuity design
A research design for estimating a causal effect that estimates the discontinuous jump in an outcome on either side of a threshold that determines treatment assignment.
86
Continuity at the threshold
The requirement that average potential outcomes do not change discontinuously at the threshold that determines treatment assignment. If continuity at the threshold doesn’t hold, then a regression discontinuity design does not provide an unbiased estimate of the local average treatment effect.
87
Sharp RD:
An RD design in which treatment assignment is fully determined by which side of the threshold the running variable is on.
88
Fuzzy RD:
A research design that combines RD and IV. The fuzzy RD is used when treatment assignment is only partially determined by which side of the threshold the running variable is on. The researcher, therefore, uses which side of the threshold the running variable is on as an instrument for treatment assignment. In this setting, continuity at the threshold guarantees that the exogeneity assumption of IV is satisfied. But we still have to worry about the exclusion restriction and the other IV assumptions.
89
Difference-in-differences:
A research design for estimating causal effects when some units change treatment status over time but others do not.
90
Parallel trends:
The condition that average potential outcomes without treatment follow the same trend in the units that do and do not change treatment status. This says that average outcomes would have followed the same trend had it not been for some unit’s changing treatment status. If parallel trends doesn’t hold, difference-in-differences does not provide an unbiased estimate of the ATT.
91
First differences:
A statistical procedure for implementing difference-in-differences. It involves regressing the change in outcome for each unit on the change in treatment for each unit.
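With only two periods, first differences reduces to simple arithmetic; a sketch with made-up group averages:

```python
# Hypothetical two-period averages
treated_pre, treated_post = 10.0, 15.0  # group that changes treatment status
control_pre, control_post = 8.0, 11.0   # group that never changes status

# Each group's change over time (the "first differences")
change_treated = treated_post - treated_pre  # 5.0
change_control = control_post - control_pre  # 3.0

# Difference-in-differences: the extra change in the treated group,
# attributing the common trend (3.0) to factors other than treatment
did = change_treated - change_control  # 2.0
print(did)
```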
92
Wide format:
A way to structure a data set in which each unit is observed multiple times, where each row corresponds to a unique unit.
93
Long format:
A way to structure a data set in which each unit is observed multiple times, where there is a row for each unit in each time period.
94
Fixed effects regression:
A statistical procedure for implementing difference-in-differences. It involves regressing the outcome on the treatment while also including dummy variables (fixed effects) for each time period and for each unit.
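A sketch of the fixed effects implementation for a hypothetical two-period panel, assuming pandas and statsmodels are available; the coefficient on the treatment variable is the difference-in-differences estimate:

```python
import pandas as pd
import statsmodels.formula.api as smf  # assumes statsmodels is installed

# Hypothetical long-format panel: one row per unit per period
df = pd.DataFrame({
    "unit":   ["a", "a", "b", "b", "c", "c"],
    "period": [1, 2, 1, 2, 1, 2],
    "treat":  [0, 1, 0, 1, 0, 0],
    "y":      [10.0, 15.0, 12.0, 17.0, 8.0, 11.0],
})

# Regress the outcome on treatment plus unit and period dummies (fixed effects)
model = smf.ols("y ~ treat + C(unit) + C(period)", data=df).fit()
print(model.params["treat"])  # 2.0, matching the two-period arithmetic above
```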
95
Pre-trends:
The trend in average outcomes before any unit changes treatment status. If pre-trends are not parallel, it is harder to make the case that the parallel trends condition is plausible.
96
Lead treatment variable:
A dummy variable indicating that treatment status in a unit will change in the next time period.
97
Mediator:
A feature of the world that is affected by the treatment and affects the outcome.
98
Causal mediation analysis:
Techniques for trying to estimate how much of the effect of a treatment on an outcome is the result of the treatment’s effect on a mediator and the mediator’s effect on the outcome.
99
Percentage point change:
The simple numerical difference between two percentages.
100
Percent change:
A way of measuring the degree of change. It is the difference between the new value and the initial value, divided by the initial value (multiplied by 100). Unlike percentage point change, percent change is highly sensitive to the initial value.
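The contrast between the two measures is easy to see in code (the function names here are just illustrative): turnout rising from 40% to 50% is a 10 percentage point change but a 25 percent change:

```python
def percentage_point_change(old_pct, new_pct):
    # Simple numerical difference between two percentages
    return new_pct - old_pct

def percent_change(old_value, new_value):
    # Change relative to the initial value, times 100
    return (new_value - old_value) / old_value * 100

print(percentage_point_change(40, 50))  # 10 percentage points
print(percent_change(40, 50))           # 25.0 percent
```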
101
Conditional probability:
The probability of an event conditional on some other information. We write the probability of C conditional on E as Pr(C|E).
102
Prior belief:
Your belief about some thing before learning new evidence.
103
Posterior belief:
Your belief about some thing after incorporating new evidence.
104
Bayes’ rule:
A formula for calculating your posterior belief conditional on new evidence and your prior belief.
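A worked sketch of Bayes’ rule with made-up numbers, computing the posterior Pr(C|E) = Pr(E|C) · Pr(C) / Pr(E):

```python
# Hypothetical numbers: prior belief and likelihoods
prior = 0.01            # Pr(C): prior probability the claim is true
p_e_given_c = 0.90      # Pr(E|C): probability of the evidence if C is true
p_e_given_not_c = 0.10  # Pr(E|not C): probability of the evidence if C is false

# Total probability of observing the evidence
p_e = p_e_given_c * prior + p_e_given_not_c * (1 - prior)

# Bayes' rule: posterior belief after incorporating the evidence
posterior = p_e_given_c * prior / p_e
print(posterior)  # ~0.083: strong evidence, but a low prior keeps the posterior modest
```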
105
Statistical power:
The probability of finding a statistically significant result in the data given that the relationship really exists in the world.
106
Internal validity
An estimate is internally valid if it is a credible estimate of the estimand (e.g., the estimator is unbiased).
107
External validity:
An estimate is externally valid if there is good reason to think the relationship will hold in a context other than the one from which the data is drawn.
108
Strategic adaptation:
Changes in behavior that result from an attempt to avoid the effects of a change in someone else’s behavior: efforts to improve outcomes on some dimensions lead people to adjust their behavior to get around those efforts.
109
Selected sample:
A sample of data that wasn’t drawn at random from the population of interest but rather was selected to be studied because it possessed some particular set of characteristics.
110
Causal effect:
The change in some feature of the world that would result from a change to some other feature of the world.
111
Average Treatment Effect (ATE):
The difference in average outcome comparing two counterfactual scenarios—one where everyone in the population is treated and one where everyone in the population is untreated.
112
Average Treatment Effect on the Treated (ATT):
The difference in average outcome comparing the scenario where everyone in the subgroup of people who in fact received treatment is treated and the counterfactual scenario where everyone in that subgroup is untreated.
113
Average Treatment Effect on the Untreated (ATU):
The difference in average outcome comparing the counterfactual scenario where everyone in the subgroup of people who did not receive treatment is treated and the scenario where everyone in that subgroup is untreated.
114
Difference in means:
The difference in average outcome comparing the subgroup of people who in fact received treatment to the subgroup of people who in fact did not receive treatment.
115
Baseline differences:
Differences in the average potential outcome between two groups (e.g., the treated and untreated groups), even when those two groups have the same treatment status.
116
Confounder:
A feature of the world that (1) has an effect on treatment status and (2) has an effect on the potential outcome over and above the effect it has through its effect on treatment status.
117
Reverse causality:
When the outcome affects treatment status.
118
Over-estimate:
When the bias is positive, so that the estimate is larger than the true effect in expectation.
119
Under-estimate:
When the bias is negative, so that the estimate is smaller than the true effect in expectation.
120
Mechanism (or mediator):
A feature of the world that the treatment affects, which then, in turn, affects the outcome.
121
Pre-treatment covariate:
A variable that is correlated with treatment and outcome before the treatment occurs.
122
Who’s a bad bitch?
Alex Szmyd