Chapter 2 - Tsay Flashcards
(58 cards)
forecasting vs predictions
Forecasting is about the future (out-of-sample along the time dimension).
Prediction is about cross-sections (unseen units, not future time points).
name an advantage of using linear, as opposed to non-linear, models
The bias-variance tradeoff makes it very unlikely that the data is being overfitted; linear models tend to err on the underfitting side.
The linear model is also robust to small datasets. Fitting non-linear models typically requires very large datasets.
what is the random-walk type of forecasting?
Just use the current value as the forecast: ^y_{t+1} = y_t. A naive but common baseline.
what is the goal of univariate time series modeling?
We want to model the conditional expectation E[y_t | F_{t-1}], where F_{t-1} is the information set available at time t-1.
elaborate on seasonality
Seasonality refers to regular, cyclical patterns that repeat at a fixed period (e.g. monthly or quarterly effects).
how to deal with seasonality?
STL decomposition. It entails decomposing the series into a trend component, a seasonal component, and a remainder.
Seasonal differencing is also widely used.
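For the STL route, a minimal sketch in Python with statsmodels (the toy series and period=12 for monthly data are assumptions):

import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL

# Toy monthly series: linear trend + seasonal cycle + noise (assumed data).
idx = pd.date_range("2015-01-01", periods=96, freq="MS")
y = pd.Series(0.5 * np.arange(96)
              + 10 * np.sin(2 * np.pi * np.arange(96) / 12)
              + np.random.default_rng(0).normal(0, 1, 96), index=idx)

# STL splits the series into trend + seasonal + remainder.
res = STL(y, period=12).fit()
trend, seasonal, remainder = res.trend, res.seasonal, res.resid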
elaborate on seasonal differencing
Y_t' = Y_t - Y_{t-s}, where s is the seasonal period.
Y_t' is the seasonally differenced series: each observation is compared with its counterpart one season earlier, so with monthly data (s = 12) a January is compared only with the previous January, and so on.
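A short sketch with pandas, assuming monthly data so that s = 12 (the toy series is an assumption):

import numpy as np
import pandas as pd

# Toy monthly series with a strong seasonal pattern (assumed data).
y = pd.Series(10 * np.sin(2 * np.pi * np.arange(48) / 12)
              + np.random.default_rng(1).normal(0, 1, 48))

# Seasonal differencing: Y_t' = Y_t - Y_{t-12}.
y_sdiff = y.diff(12).dropna()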
why do seasonal differencing?
It removes seasonal patterns, which helps us identify trends and assess stationarity.
It also prepares the data for ARIMA modeling.
what is a linear time series model?
A tool for analyzing the dynamic structure of a time series. It is linear, so it is restricted to capturing linear relationships.
what do linear time series models use?
Information available up to some point in time, typically historical values of the same variable we are forecasting.
elaborate on the foundation of time series analysis
Stationarity.
We have strict and weak stationarity.
Strict stationarity requires that the joint distribution of a set of consecutive variables in the time series and the corresponding lag-l shifted set of variables are equal. This is extremely strong and hard to verify empirically.
Weak stationarity requires:
constant mean
constant lag-l autocovariance for every l (this includes lag 0, i.e. constant variance).
why do we need stationarity?
It allows us to make predictions: the statistical properties we estimate from past data carry over to future data.
the lag-l autocovariance has 2 important properties
1) gamma_0 = Var(y_t), which is constant under weak stationarity.
2) gamma_{-l} = gamma_l (the autocovariance function is symmetric in the lag).
if two variables have 0 correlation, what does this mean?
In general it only means they are linearly unrelated; it implies independence only if they are jointly normally distributed.
Elaborate on ACF
Autocorrelation function.
We denote the lag-l autocorrelation as “p_l”.
p_l = gamma_l / gamma_0
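A minimal sketch of the sample version of this formula in Python (numpy only; the function name sample_acf is mine):

import numpy as np

def sample_acf(y, max_lag):
    # Sample lag-l autocorrelation: ^p_l = ^gamma_l / ^gamma_0,
    # with autocovariances computed around the sample mean.
    y = np.asarray(y, dtype=float)
    T = len(y)
    dev = y - y.mean()
    gamma0 = np.sum(dev ** 2) / T
    return np.array([np.sum(dev[l:] * dev[:-l]) / T / gamma0
                     for l in range(1, max_lag + 1)])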
elaborate on the requirements for ^p_l to be a consistent estimate of the true lag-l autocorrelation
This card is framed oddly.
The point is that the sample ACF is a consistent estimator of the true ACF under certain conditions, and these conditions are met under weak stationarity.
Moreover, if the shock series of our time series is iid with mean 0, we know the sample autocorrelation is asymptotically normal with mean 0 and variance given either by 1/T or by Bartlett's formula.
This is important because we can use it to test each autocorrelation (for each lagged variable) and see whether it is statistically different from 0.
If the lag-l autocorrelation coefficient is extreme, we reject the null hypothesis, which amounts to concluding that there is some structure linking the lag-l variable and the current variable.
This is done with a regular t-test.
Furthermore, once we have established a correlation between r_t and r_{t-l}, this is evidence that we should include a term in our model that uses the shock from t-l.
Elaborate on the crucial part of the sample lag-l autocorrelation function
If the time series is iid with a finite second moment, then the sample lag-l autocorrelation is asymptotically normal with mean 0 and variance 1/T.
This is crucial because it is THE foundation for testing the null hypothesis p_l = 0, so we can figure out whether autocorrelation is present.
Again, this card is weird. The above is basically about testing for a white noise series.
If we want to test the statistical significance of the ACF of a regular (linear) weakly stationary time series, we use Bartlett's formula for the variance: under those conditions the sample ACFs are asymptotically normally distributed with mean 0 and variance given by Bartlett.
give the t-ratio test for sample lag-l autocorrelation
The test statistic is the t-ratio, which is:
t = ^p_l / SE(^p_l) = ^p_l / (1/sqrt(T)) = sqrt(T) ^p_l
Under the null it is asymptotically standard normal, so this is basically a test of the white noise conditions.
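A sketch of this test in Python, using statsmodels for the sample ACF (the simulated series is an assumption):

import numpy as np
from statsmodels.tsa.stattools import acf

y = np.random.default_rng(0).normal(size=500)  # simulated white noise
rho = acf(y, nlags=10)[1:]                     # ^p_1 ... ^p_10 (drop lag 0)
t_ratios = np.sqrt(len(y)) * rho               # t = sqrt(T) * ^p_l
reject = np.abs(t_ratios) > 1.96               # two-sided 5% threshold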
elaborate on Bartlett
Bartlett says that if r_t is a weakly stationary series satisfying the linearity requirement r_t = mu + sum_{i=0}^{q} psi_i a_{t-i}, where a_t is a white noise series, then for l > q the sample lag-l autocorrelation is asymptotically normal with mean 0 and variance (1 + 2 * sum_{i=1}^{q} p_i^2) / T, i.e. Bartlett's formula.
The result: if the time series is linear and weakly stationary, we use Bartlett.
How can we test individual lag-l autocorrelations for statistical significance?
We use the t-ratio with Bartlett's variance in the standard error.
This is a two-sided test.
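A sketch of Bartlett-based t-ratios in Python, using the practical version of the formula where the true p_i are replaced by sample estimates (the function name bartlett_t_ratios is mine):

import numpy as np
from statsmodels.tsa.stattools import acf

def bartlett_t_ratios(y, max_lag):
    # Bartlett variance for ^p_l: (1 + 2 * sum_{i=1}^{l-1} ^p_i^2) / T.
    # At l = 1 the sum is empty, so the variance reduces to 1/T.
    T = len(y)
    rho = acf(y, nlags=max_lag)[1:]
    var = (1.0 + 2.0 * np.cumsum(np.r_[0.0, rho[:-1] ** 2])) / T
    return rho / np.sqrt(var)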
important thing to remember regarding the sample ACF
It is biased in small samples, with bias of order 1/T. This can be sizable when T is small, but is not an issue for larger samples.
“disadvantage” of using the t-ratio for testing the ACF
It tests one lag at a time.
how can we speed up the testing process?
We make it more general and use portmanteau testing, which tests several lags jointly.
There is the traditional Q* (Box-Pierce) statistic and the Ljung-Box statistic.
elaborate on the Q* statistic
A portmanteau test used to test multiple lags for autocorrelation jointly.
Q*(m) = T * sum_{l=1}^{m} ^p_l^2
We square the sample autocorrelations, sum them, and multiply by the sample size T. Under the null each sqrt(T) ^p_l is asymptotically standard normal, so the statistic is asymptotically chi-squared with m degrees of freedom, where m is the number of lags we include.
One-sided test: reject if the value is very large.
The null hypothesis is that the first m lag autocorrelations are all 0.
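Both statistics are available in statsmodels; the Ljung-Box version, Q(m) = T(T+2) * sum_{l=1}^{m} ^p_l^2 / (T-l), is a small-sample refinement of Q*. A sketch (the simulated series is an assumption; recent statsmodels versions return a DataFrame):

import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

y = np.random.default_rng(2).normal(size=500)  # white noise: null should hold
# boxpierce=True also returns the classic Q* (Box-Pierce) statistic.
res = acorr_ljungbox(y, lags=[5, 10], boxpierce=True)
print(res)  # columns: lb_stat, lb_pvalue, bp_stat, bp_pvalue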