Flashcards in L4 Deck (27)
What is a panel data set?
A panel data set contains observations on multiple entities, where each entity is observed at two or more periods in time
What is a balanced panel?
No missing observations (variables observed for all entities and all time periods)
Why are panel data sets (PDSs) useful in connection with omitted variable bias (OVB)?
If some factors that affect Y are not included in the data set (i.e. omitted variables, OVs), they will likely vary between entities but are unlikely to vary across time. Therefore any change in Y over time for a given entity cannot be caused by the omitted variable!
Example: traffic deaths and alcohol taxes (important to read and understand, don't think need to learn it by heart) (make sure I read it!!!)
How might an unexpected relationship be explained?
OVB in the model
If you have an omitted variable Z that cannot be controlled for (ie. is not observed) what is a solution to this?
Compare between time periods on the same entity since the OV should not change through time
(ie. any change in Y between periods cannot be caused by Z since Z has remained constant from one period to the other)
Show mathematically, and explain, why comparing between time periods on the same entity should solve the OVB issue.
See bottom of notes side 1 (finish)
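A sketch of the standard two-period argument, assuming the omitted variable Z(i) differs across entities but is constant over time. The period labels 1 and 2 are placeholders, not taken from the notes:

```latex
\begin{align*}
Y_{i1} &= \beta_0 + \beta_1 X_{i1} + \beta_2 Z_i + u_{i1} \\
Y_{i2} &= \beta_0 + \beta_1 X_{i2} + \beta_2 Z_i + u_{i2} \\
Y_{i2} - Y_{i1} &= \beta_1 (X_{i2} - X_{i1}) + (u_{i2} - u_{i1})
\end{align*}
```

Subtracting the first equation from the second cancels both β0 and β2Z(i), so regressing the change in Y on the change in X (without an intercept) estimates β1 without needing to observe Z.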
Using the 'difference' equation works for 2 time periods, but how do we create 'Fixed Effects' regressions for more than 2 time periods?
1) 'n-1' binary regressor model
2) 'Fixed effects' regressor model
Draw and explain the diagram for the fixed effects regressor model?
The intercept, given by say α(i) = β0 + β2Z(i), is different for each entity (i.e. each US state), but the slope β1 is the same for all of them, therefore the lines are parallel
That I can formulate the n-1 binary regressor model - important!!!
That I can formulate a fixed effect regressor model! also important!!!
What is α in the fixed effects model in the example?
it is the 'state fixed effect' or 'state effect' (In the states example) (ie. it is the constant fixed effect of being in state i!)
What piece of information allows us to turn the fixed effects model into the n-1 binary regressor model?
The fact that shifts in the intercept can be represented using binary regressors (see diagram)
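For reference, one common way to write the two formulations. The symbols γ and D are notation assumed here for illustration; the notes may use different letters:

```latex
\begin{align*}
\text{Fixed effects:} \quad
  & Y_{it} = \beta_1 X_{it} + \alpha_i + u_{it},
  \qquad \alpha_i = \beta_0 + \beta_2 Z_i \\
\text{$n-1$ binary regressors:} \quad
  & Y_{it} = \beta_0 + \beta_1 X_{it} + \gamma_2 D2_i + \dots + \gamma_n Dn_i + u_{it}
\end{align*}
```

Here D2(i) = 1 if i = 2 and 0 otherwise, and so on. Entity 1 is the omitted base category, so α(1) = β0 and α(i) = β0 + γ(i) for i ≥ 2. The slope β1 is the same in both forms.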
How do we estimate these models?
1) 'n-1 binary regressors' OLS formulation (only practical if n isn't too big, otherwise you end up with too many binary variables)
2) 'Changes' specification without an intercept (only works if T=2)
3) Entity-demeaned OLS regression
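The three approaches listed above can be compared on simulated data. This is a minimal sketch, assuming a made-up panel with T = 2 so that the 'changes' specification applies. All numbers and variable names are illustrative and not from the notes.

```python
import numpy as np

# Hypothetical panel: n entities, T = 2 periods, rows ordered entity by entity.
rng = np.random.default_rng(1)
n, T = 50, 2
entity = np.repeat(np.arange(n), T)
alpha = rng.normal(size=n)                      # assumed entity fixed effects
x = rng.normal(size=n * T) + alpha[entity]      # X is correlated with alpha
y = alpha[entity] + 0.5 * x + 0.1 * rng.normal(size=n * T)

# 1) n-1 binary regressors: intercept, X, and dummies for entities 1..n-1.
dummies = (entity[:, None] == np.arange(1, n)).astype(float)
design = np.column_stack([np.ones(n * T), x, dummies])
b_dummy = np.linalg.lstsq(design, y, rcond=None)[0][1]

# 2) 'Changes' specification without an intercept (T = 2 only).
dx = x[1::2] - x[0::2]
dy = y[1::2] - y[0::2]
b_change = (dx @ dy) / (dx @ dx)

# 3) Entity-demeaned OLS: subtract each entity's own time average.
def demean(v):
    means = np.bincount(entity, weights=v) / T
    return v - means[entity]

x_tilde, y_tilde = demean(x), demean(y)
b_demean = (x_tilde @ y_tilde) / (x_tilde @ x_tilde)

print(b_dummy, b_change, b_demean)
```

With T = 2 the three slope estimates coincide up to rounding error, which is why the choice between them is mostly about convenience (for example, avoiding a very large number of dummy variables).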
How would one carry out the n-1 BRs OLS regression?
create binary variables
estimate by OLS
(inference (ie. hypothesis tests etc.))
For entity-demeaned OLS (see notes again to familiarise myself with it!!!), how is the variable Y(squiggle)(it) interpreted, using 'US states' as an example? (where Y is the fatality rate)
For i=1 and t=1982, Y(squiggle)(it) is the difference between the fatality rate in 'state 1' in 1982, and the average value of the fatality rate in that state across the time period studied
ie. For entity i, it is the difference between Y in year t and average Y across all t=1,...,T
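A tiny numeric illustration of the demeaning step, using made-up fatality rates for two hypothetical states over three years:

```python
import numpy as np

# Hypothetical fatality rates: state 0 in rows 0-2, state 1 in rows 3-5.
state = np.array([0, 0, 0, 1, 1, 1])
y = np.array([2.0, 2.5, 3.0, 1.0, 1.2, 1.4])

# Average of Y within each state across all T periods.
state_means = np.array([y[state == s].mean() for s in np.unique(state)])

# Y-tilde(it) = Y(it) minus the time average of Y for entity i.
y_tilde = y - state_means[state]
print(y_tilde)
```

Each demeaned value says how far that state's fatality rate in that year is from the state's own average, so the demeaned values sum to zero within each state.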
Thing to remember when calculating SEs for panel dataset?
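May need to use a different SE formula to reflect the different nature of the data: observations on the same entity are likely to be correlated over time, so the usual cross-sectional formula may not be valid!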
When might we need regression with TIME-fixed effects? (states example) What do these lead to?
When an omitted variable varies across time but not across states; eg. safer cars, air bags, changes in national laws etc.
Lead to intercepts that vary over time!
What does entity demeaned OLS do?
It eliminates α(i) from the regression
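A sketch of why α(i) drops out, assuming the fixed effects model is written with a single regressor X and that bars denote averages over t = 1,...,T for entity i:

```latex
\begin{align*}
Y_{it} &= \beta_1 X_{it} + \alpha_i + u_{it} \\
\bar{Y}_i &= \beta_1 \bar{X}_i + \alpha_i + \bar{u}_i \\
\tilde{Y}_{it} = Y_{it} - \bar{Y}_i &= \beta_1 (X_{it} - \bar{X}_i) + (u_{it} - \bar{u}_i)
  = \beta_1 \tilde{X}_{it} + \tilde{u}_{it}
\end{align*}
```

Because α(i) does not vary over time, its time average is α(i) itself, so it cancels when the averaged equation is subtracted from the original one.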
How is the standard population regression model modified to describe:
a) OVB between states
b) OVB within states (i.e. within-state variables that change across time)
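One common textbook way to write the two cases, offered as a sketch rather than the formulation in the notes. It reads (b) as omitted variables that change over time but are common to all states. The symbols S, β3 and λ are assumed notation:

```latex
\begin{align*}
\text{a)} \quad & Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_2 Z_i + u_{it}
  && \Rightarrow \quad Y_{it} = \beta_1 X_{it} + \alpha_i + u_{it} \\
\text{b)} \quad & Y_{it} = \beta_0 + \beta_1 X_{it} + \beta_3 S_t + u_{it}
  && \Rightarrow \quad Y_{it} = \beta_1 X_{it} + \lambda_t + u_{it}
\end{align*}
```

In (a), Z(i) varies across states but not over time, giving state-specific intercepts α(i) = β0 + β2Z(i). In (b), S(t) varies over time but not across states, giving intercepts λ(t) = β0 + β3S(t) that vary over time.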
Explain what an entity fixed-effects model is?
It is a model that allows each entity to have its own characteristics that are fixed across time periods and apply only to that entity; these are captured by an entity-specific intercept.
(eg. students taking a class test each year; fixed effects are intelligence, performance under pressure etc.)
(Note: a time-fixed effects model would then be required if there was a variable that varied across time but not between students, such as the teacher each year!)
What is a direct way to estimate a Panel Data Model?
Feasible generalised least squares!
All the formulations of different models in notes, v important
Why do we use clustered standard errors in fixed-effects models?
Because observations for the same entity are NOT independent (ie. there is serial correlation within-entities (ie. across time))
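A rough sketch of how a clustered SE differs from the usual formula in an entity-demeaned regression with one regressor. The data are simulated with an assumed AR(1) structure within each entity, and no finite-sample degrees-of-freedom correction is applied to the clustered formula, so treat this as an illustration rather than a production implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
n, T = 200, 5
entity = np.repeat(np.arange(n), T)      # rows ordered entity by entity

def ar1_panel(rho):
    """Simulate an (n, T) array that is AR(1) over time within each entity."""
    shocks = rng.normal(size=(n, T))
    out = np.empty((n, T))
    out[:, 0] = shocks[:, 0]
    for t in range(1, T):
        out[:, t] = rho * out[:, t - 1] + shocks[:, t]
    return out.ravel()

alpha = rng.normal(size=n)               # assumed entity fixed effects
x = ar1_panel(0.8)
u = ar1_panel(0.8)                       # serially correlated errors
y = alpha[entity] + 0.5 * x + u

def demean(v):
    means = np.bincount(entity, weights=v) / T
    return v - means[entity]

x_t, y_t = demean(x), demean(y)
b1 = (x_t @ y_t) / (x_t @ x_t)
resid = y_t - b1 * x_t

# Clustered: add up x_t * resid within each entity first, then square,
# which keeps the within-entity cross products over time.
scores = np.bincount(entity, weights=x_t * resid)
se_clustered = np.sqrt((scores ** 2).sum()) / (x_t @ x_t)

# Homoskedasticity-only formula, which treats all observations as independent.
se_naive = np.sqrt(resid @ resid / (n * T - n - 1) / (x_t @ x_t))

print(b1, se_clustered, se_naive)
```

The key design point is that the clustered formula sums within each entity before squaring, so correlation between errors in different years for the same entity is allowed for, while errors for different entities are still treated as uncorrelated.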
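Is the OLS fixed-effects estimator of β1 still normally distributed?
Yes, approximately, when n is large: under the fixed-effects regression assumptions the estimator is consistent and its sampling distribution is approximately normal.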
Explain why time-series variables are often auto/serially correlated. What is the implication of this?
When a variable Q is observed at different dates (for example, if Q is yearly weather), you often find that Q is correlated through time because, say, cold winters come in clusters, or hot years come in clusters. Therefore the observations Q(t) are not independent of each other.
Implication: if u(it) is autocorrelated in this way, the usual OLS SEs are wrong.
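A small simulation of this 'clustering through time', assuming Q follows an AR(1) process with a made-up persistence parameter of 0.8:

```python
import numpy as np

rng = np.random.default_rng(3)
T, rho = 2000, 0.8          # rho is an assumed persistence parameter

# Each value of Q carries over a fraction rho of the previous period's value.
q = np.zeros(T)
for t in range(1, T):
    q[t] = rho * q[t - 1] + rng.normal()

# Sample correlation between Q(t) and Q(t-1).
lag1_corr = np.corrcoef(q[:-1], q[1:])[0, 1]
print(lag1_corr)
```

The lag-1 correlation comes out clearly positive, so neighbouring observations are not independent draws.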