Quantitative methods Flashcards
Multiple regression model assumptions
- linearity
- homoskedasticity –> variance of the residuals is constant
- independence of errors –> residuals are not serially correlated
- normality –> error term is normally distributed (evaluated with a QQ plot)
- independence of independent variables –> no linear relationship between the independent variables
MSR
MSR = RSS/k
MSE
MSE = SSE/(n−k−1)
SST
RSS+SSE
R2
RSS/SST
or
(SST−SSE)/SST
or
(total variation − unexplained variation)/total variation
indicates the share of the total variation in the dependent variable that the independent variables can explain
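A minimal numpy sketch of the sum-of-squares identities above, on simulated (hypothetical) data: it checks SST = RSS + SSE and that the two R² forms agree.

```python
import numpy as np

# Hypothetical data: y regressed on two independent variables (illustration only).
rng = np.random.default_rng(0)
n, k = 50, 2
X = rng.normal(size=(n, k))
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

Xd = np.column_stack([np.ones(n), X])          # design matrix with intercept
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)  # OLS coefficients
y_hat = Xd @ beta

SST = np.sum((y - y.mean()) ** 2)       # total variation
RSS = np.sum((y_hat - y.mean()) ** 2)   # explained (regression) variation
SSE = np.sum((y - y_hat) ** 2)          # unexplained variation

R2 = RSS / SST
assert np.isclose(SST, RSS + SSE)          # SST = RSS + SSE
assert np.isclose(R2, (SST - SSE) / SST)   # equivalent form of R^2
```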
Breusch pagan
BP = n*R^2, where R^2 comes from regressing the squared residuals on the independent variables; chi-square distributed with k df
Adjusted R2
1-((n-1)/(n-k-1))*(1-R^2)
o measure of goodness of fit that adjusts for the number of independent variables
o adj R2<R2
o decreases when the added independent variable adds little value to regression model
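Plugging hypothetical numbers into the adjusted R² formula above (n, k, and R² are made up for illustration):

```python
# Adjusted R^2 from the card's formula; n, k, R2 are hypothetical values.
n, k, R2 = 60, 3, 0.52
adj_R2 = 1 - ((n - 1) / (n - k - 1)) * (1 - R2)
print(round(adj_R2, 4))  # 0.4943 (note: adj R2 < R2)
```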
Cook’s D
If observation > √(k/n)–> influential point
Odds
Prob given odds
Odds= e^coefficient
Prob with odds = odds/(1+odds)
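A one-line check of the odds-to-probability conversion above; the logistic coefficient is hypothetical.

```python
import math

# Logistic regression: odds from a (hypothetical) coefficient, then probability.
b = 0.7
odds = math.exp(b)        # odds = e^coefficient
p = odds / (1 + odds)     # probability implied by the odds, ~0.668
# equivalently p = 1 / (1 + e^(-b)), the logistic function
```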
F statistic
((SSEr−SSEu)/q) / (SSEu/(n−k−1)) –> joint test of q restrictions (restricted vs. unrestricted model)
= MSR/MSE with k and n−k−1 df for the overall F-test (restricted model is intercept-only, q = k)
H0: all slope coefficients are zero
reject H0 if F (test statistic) > Fc (critical value)
tests whether at least one slope coefficient is significant
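A sketch of the overall F-test decision rule, using hypothetical ANOVA quantities and scipy for the critical value:

```python
import numpy as np
from scipy.stats import f

# Hypothetical ANOVA quantities for an overall F-test (illustration only).
n, k = 40, 3
RSS, SSE = 120.0, 80.0
MSR = RSS / k                   # 40.0
MSE = SSE / (n - k - 1)         # 80/36
F = MSR / MSE                   # = 18.0
Fc = f.ppf(0.95, k, n - k - 1)  # critical value at the 5% level, k and n-k-1 df
reject = F > Fc                 # reject H0: all slope coefficients are zero
```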
Conditional Heteroskedasticity
Residual variance is related to level of independent variables
- Coefficients consistent.
- St. errors underestimated
- Type I errors
DETECTION
* Breusch–Pagan chi-square test
* p-value < 5% (test statistic > critical value) –> reject H0 –> heteroskedasticity
* p-value > 5% –> fail to reject H0 –> no heteroskedasticity
CORRECTION
robust or White-corrected standard errors
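The Breusch–Pagan detection step can be sketched with numpy on simulated (hypothetical, homoskedastic-by-construction) data: regress squared residuals on the independent variables and form n·R² of that auxiliary regression.

```python
import numpy as np

# Breusch-Pagan sketch on simulated (hypothetical) data.
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 2))
y = 0.5 + X @ np.array([1.0, -2.0]) + rng.normal(size=n)

Xd = np.column_stack([np.ones(n), X])             # design matrix with intercept
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)     # main regression
resid = y - Xd @ beta

u2 = resid ** 2                                   # squared residuals
gamma, *_ = np.linalg.lstsq(Xd, u2, rcond=None)   # auxiliary regression
u2_hat = Xd @ gamma
R2_aux = np.sum((u2_hat - u2.mean()) ** 2) / np.sum((u2 - u2.mean()) ** 2)
BP = n * R2_aux   # compare to a chi-square critical value with k = 2 df
```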
Serial Correlation
Residuals are correlated with each other
- Coefficients consistent
- St errors underestimated
- Type I errors (positive correlation)
DETECTION
* Breusch–Godfrey (BG) F-test
* Durbin Watson (DW)
* DW<2–> pos. serial corre.
CORRECTION
Use robust or Newey–West corrected standard errors
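The Durbin–Watson statistic above can be computed directly from residuals; here on simulated (hypothetical) white noise, so DW should land near 2.

```python
import numpy as np

# Durbin-Watson statistic: sum of squared successive differences over SSE.
# DW ~ 2(1 - r), r = lag-1 autocorrelation; DW near 2 -> no serial correlation,
# DW < 2 -> positive serial correlation. Residuals here are simulated white noise.
rng = np.random.default_rng(2)
e = rng.normal(size=500)
DW = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
```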
Multicollinearity
Two or more independent variables are highly correlated
- Coefficients are consistent (but unreliable).
- St errors are overestimated
- Type II errors
DETECTION
* Conflicting t and F-statistics
* variance inflation factors (VIF)
* VIF > 5 (or 10) –> problem
CORRECTION
* Drop one of the correlated variables
* use a different proxy for an included independent variable
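A numpy sketch of the VIF computation: VIF_j = 1/(1 − R²_j), where R²_j comes from regressing variable j on the other independent variables. The data are hypothetical, with one variable built to be nearly collinear with the others.

```python
import numpy as np

# Hypothetical data: x3 is nearly a linear combination of x1 and x2.
rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2 + rng.normal(scale=0.05, size=n)   # near-collinear by construction
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF of column j: regress it on the other columns (with intercept)."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    Xd = np.column_stack([np.ones(len(y)), others])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    y_hat = Xd @ beta
    r2 = np.sum((y_hat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)
    return 1.0 / (1.0 - r2)

vifs = [vif(X, j) for j in range(X.shape[1])]   # all well above the 5/10 cutoff
```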
MISSPECIFICATIONS
Omission of important independent variable(s)–>May lead to serial correlation or heteroskedasticity in the residuals
Inappropriate transformation / variable form–> May lead to heteroskedasticity in the residuals
Inappropriate scaling–>May lead to heteroskedasticity in the residuals or multicollinearity
Data improperly pooled–>May lead to heteroskedasticity or serial correlation in the residuals; fix by running separate regressions for each period
Autoregressive (AR) Model
- AR(1): one lag–>the dependent variable is regressed against previous values of itself
- no distinction between the dependent and independent variables (i.e., x is the only variable)
- USE t-tests to determine whether the correlations between residuals at any lag are statistically significant; if a residual autocorrelation is significant, add that lag to the model (one lag at a time)
- if not covariance stationary–> correct with first differencing
- Ex: forecasting a currency's value from its historical prices
- Chain rule forecasting: use earlier forecasts as inputs to later forecasts
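The chain rule of forecasting for an AR(1) can be sketched in two lines; b0, b1, and the current value are hypothetical.

```python
# Chain rule of forecasting for an AR(1): x_{t+1} = b0 + b1 * x_t, then feed
# the one-step forecast back in to get the two-step forecast. Values hypothetical.
b0, b1, x_t = 1.0, 0.8, 10.0
x_t1 = b0 + b1 * x_t    # one-step-ahead: 1 + 0.8*10  = 9.0
x_t2 = b0 + b1 * x_t1   # two-step-ahead: 1 + 0.8*9.0 = 8.2
# forecasts decay toward the mean-reverting level b0/(1-b1) = 5.0
```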
Covariance Stationary
- residual autocorrelations not statistically significant –> the AR model is correctly specified for a covariance stationary series
o Constant and finite mean: E(xt) = E(xt−1) (NOTE: the mean has no growth rate/trend)
o Constant and finite variance
o Constant and finite covariance
- the Dickey–Fuller test is used to determine covariance stationarity (tests for a unit root)
Mean Reversion
A time series is mean reverting if it tends towards its mean over time
mean-reverting level = b0/(1−b1)
If b1 = 1 –> the mean-reverting level is undefined, because it would be b0/0
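A quick numeric check of the mean-reverting level formula, with hypothetical AR(1) coefficients:

```python
# Mean-reverting level of an AR(1): b0 / (1 - b1). Coefficients hypothetical.
b0, b1 = 1.0, 0.8
level = b0 / (1 - b1)   # 5.0
# if b1 == 1 the denominator is zero: the level is undefined (unit root)
```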
Unit Root = Random walk
- b1 = 1 –> must first difference the data
- Undefined mean-reverting level–>Not covariance stationary
Random Walk
- random walk = the value in one period equals the value in the previous period, plus a random error
- Random walk without a drift: xt = xt−1 + εt b0=0 and b1=1
- Random walk with a drift (con b0): xt = b0 + xt−1 + εt b1=1
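A simulated (hypothetical) random walk shows why first differencing works: the difference x_t − x_{t−1} is just the error term, which is covariance stationary.

```python
import numpy as np

# Random walk without drift: x_t = x_{t-1} + eps_t, built from simulated errors.
rng = np.random.default_rng(4)
eps = rng.normal(size=1000)
x = np.cumsum(eps)    # the random walk itself (not covariance stationary)
dx = np.diff(x)       # first difference recovers eps_t (stationary)
assert np.allclose(dx, eps[1:])
```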
Seasonality
- More than 1 lag
o quarterly data = seasonal lag is 4;
o monthly data = seasonal lag is 12.
Root Mean Squared Error (RMSE)
to assess accuracy of autoregressive models.
* lower RMSE = better
* Out-of-sample forecasts
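RMSE on a handful of hypothetical out-of-sample actual/forecast pairs:

```python
import numpy as np

# RMSE of out-of-sample forecasts; actual and forecast values are hypothetical.
actual   = np.array([1.2, 0.8, 1.5, 1.1])
forecast = np.array([1.0, 1.0, 1.3, 1.2])
rmse = np.sqrt(np.mean((actual - forecast) ** 2))   # lower RMSE = better model
```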
- structural change
significant shift in the plotted data at a point in time that seems to divide the data into two distinct patterns
- Cointegration:
two time series are economically linked (driven by the same macro variables) or follow the same trend, and that relationship is not expected to change