Flashcards in Time Series Deck (67):
What is Time Series data?
An ordered sequence of observations, typically equally spaced in time and possibly through space as well.
What is the general equation for statistical forecasting?
Time Series (Y) = Signal + Noise
What is the difference between Trend, Seasonal and Cyclical patterns?
1. Trend: naturally going up or down.
2. Seasonal: something that happens w/ a consistent frequency and definitive pattern (Temp over 24 hours).
3. Cyclical: natural ebb and flow but not w/ a consistent frequency (US economy).
What are two methods of Time Series decomposition and 3 common decomposition techniques to calculate pattern effects?
Decomposition:
1. Additive: Y = Trend + Seasonality + Error
2. Multiplicative: Y = Trend * Seasonality * Error, or log(Y) = log(T) + log(S) + log(E)
Decomposition Techniques:
1. Classical Decomposition
2. X-12 ARIMA
3. STL (Seasonal and Trend decomposition using LOESS)
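As a rough illustration of the additive form, the sketch below runs a classical decomposition in plain Python. It assumes an even seasonal period and estimates the trend with a centered (2 x period) moving average. The function names `centered_ma` and `additive_decompose` and the toy quarterly series are made up for this example and do not come from any library.

```python
def centered_ma(y, period):
    """Centered (2 x period) moving average for an even period.

    Returns a list the same length as y, with None where the window
    does not fit (the first and last period // 2 points).
    """
    half = period // 2
    out = [None] * len(y)
    for t in range(half, len(y) - half):
        interior = sum(y[t - half + 1 : t + half])
        out[t] = (0.5 * y[t - half] + interior + 0.5 * y[t + half]) / period
    return out


def additive_decompose(y, period):
    """Split y into trend, seasonal and error so that y = T + S + E."""
    trend = centered_ma(y, period)
    # Average the detrended values season by season.
    sums = [0.0] * period
    counts = [0] * period
    for t, (obs, tr) in enumerate(zip(y, trend)):
        if tr is not None:
            sums[t % period] += obs - tr
            counts[t % period] += 1
    seasonal = [s / c for s, c in zip(sums, counts)]
    # Center the seasonal effects so they sum to zero.
    mean_s = sum(seasonal) / period
    seasonal = [s - mean_s for s in seasonal]
    error = [
        None if tr is None else obs - tr - seasonal[t % period]
        for t, (obs, tr) in enumerate(zip(y, trend))
    ]
    return trend, seasonal, error


# Toy quarterly series: linear trend plus a repeating seasonal pattern.
pattern = [2.0, -1.0, -3.0, 2.0]
series = [10 + 0.5 * t + pattern[t % 4] for t in range(20)]
trend, seasonal, error = additive_decompose(series, period=4)
```

For a multiplicative series, the same function could be applied to log(Y), since the log turns the product into a sum.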
What is autocorrelation?
Correlation of a variable with itself across time.
What is White Noise?
If all signal (pattern) has been accounted for then the errors will be independent (no pattern in residuals). This is White Noise. White noise time series follows a Normal distribution with mean of zero and positive, constant variance in which all obs are independent after accounting for pattern.
What is the test for White Noise?
Ljung-Box test. Ho: no autocorrelation up to the lags tested (white noise).
What does LOESS stand for?
LOcal regrESSion. Builds regression lines on small sections of the graph; can handle outliers very well b/c it smooths better than simple averaging.
In time series, what is a hold-out sample?
Always at the end of the time series data; typically no more than 25% of the data. Ideally an entire season should be captured in the hold-out sample. Hold out whatever you've been asked to forecast: year, quarter, month, week, day.
1. Create training and validation data sets
2. Derive a set of candidate models
3. Calculate the chosen accuracy statistic by forecasting the validation data
4. Pick the model with the best accuracy
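The four steps above can be sketched in plain Python. The three candidate models (mean, naive, drift), the use of mean absolute error as the accuracy statistic, and the toy series are all assumptions chosen for illustration; any other candidates or statistic would slot in the same way.

```python
def mae(actual, forecast):
    """Mean absolute error between two equal-length lists."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mean_model(train, horizon):
    """Forecast every future period with the training mean."""
    return [sum(train) / len(train)] * horizon


def naive_model(train, horizon):
    """Forecast every future period with the last observed value."""
    return [train[-1]] * horizon


def drift_model(train, horizon):
    """Extend the line through the first and last training points."""
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + slope * h for h in range(1, horizon + 1)]


series = [10, 12, 13, 15, 16, 18, 19, 21, 22, 24, 25, 27]

# 1. Hold out the last 25% of the series as validation data.
n_holdout = len(series) // 4
train, valid = series[:-n_holdout], series[-n_holdout:]

# 2. Candidate models.
candidates = {"mean": mean_model, "naive": naive_model, "drift": drift_model}

# 3. Accuracy statistic, computed on the validation data only.
scores = {
    name: mae(valid, model(train, n_holdout))
    for name, model in candidates.items()
}

# 4. Pick the model with the best (lowest) error.
best = min(scores, key=scores.get)
```

Because the toy series trends upward, the drift model should come out ahead of the two flat forecasts here.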
What are the 4 model diagnostic statistics used to evaluate TS model accuracy?
Can also use Information Criterion methods: AIC, SBC
Describe the difference between goodness of fit and accuracy?
Goodness of fit is calculated on training data.
Accuracy is calculated on hold-out data.
What are the characteristics of a good time series model?
1. Highly correlated with actual series values
2. Exhibit small forecast errors
3. Capture important features of the original time series
What is the goal and statistical significance of Exponential Smoothing Models?
Goal is prediction of one time period into the future. Math is focused on finding the best THETA, bounded between 0 and 1. The most recent data gets the most weight, with lower weights the further back in time. No statistical significance, as this was developed by mathematicians, not statisticians, so there is no statistical distribution in mind.
What are 3 Exponential Smoothing Models and what are they focused on?
1. Single
2. Linear/Holt (Trend)
3. Holt-Winters (Trend and Seasonality)
How is the optimal THETA found for ESMs?
THETA that minimizes sum of squared errors is chosen.
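A minimal sketch of that search for single exponential smoothing, assuming the level is initialised at the first observation and THETA is chosen from a simple grid rather than by a numerical optimiser. The names `ses_sse` and `best_theta` and the toy series are illustrative only.

```python
def ses_sse(y, theta):
    """Sum of squared one-step-ahead errors for single exponential smoothing."""
    level = y[0]  # assumption: start the level at the first observation
    sse = 0.0
    for obs in y[1:]:
        error = obs - level  # the forecast for this period is the current level
        sse += error ** 2
        level = theta * obs + (1 - theta) * level
    return sse


def best_theta(y, steps=99):
    """Grid search over theta in (0, 1) for the smallest SSE."""
    grid = [i / (steps + 1) for i in range(1, steps + 1)]
    return min(grid, key=lambda theta: ses_sse(y, theta))


series = [20, 22, 19, 21, 23, 20, 22, 21, 19, 22, 20, 21]
theta = best_theta(series)
```

On a steadily trending series the search pushes THETA toward 1, because the single model keeps lagging behind the trend. That is one hint that a Linear/Holt model may be the better fit.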
What's the difference between time series decomposition, ESM and ARIMA?
Time series decomposition removes noise, leaves the trends and is used to explore data.
ESM models one time period into the future for forecasting.
ARIMA is statistically based and allows patterns to reveal themselves through correlation and stationarity.
What correlation does Autocorrelation describe?
The correlation b/w Y(t) and Y(t-k): the same variable separated by k points in time.
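A small sketch of the sample autocorrelation at lag k, written in plain Python. Since the Ljung-Box white noise test mentioned earlier in the deck is built from these same autocorrelations, its Q statistic is included as well. The function names and the alternating toy series are made up for illustration. Using the number of lags as the chi-square degrees of freedom is an assumption that fits a raw series; for residuals of a fitted model the degrees of freedom are usually reduced by the number of estimated parameters.

```python
def acf(y, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of the series y."""
    n = len(y)
    mean = sum(y) / n
    c0 = sum((v - mean) ** 2 for v in y)
    return [
        sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(y, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_k r_k^2 / (n - k)."""
    n = len(y)
    r = acf(y, max_lag)
    return n * (n + 2) * sum(
        r[k - 1] ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# A strongly patterned series: it flips sign every period.
alternating = [1.0, -1.0] * 5
r = acf(alternating, max_lag=2)
q = ljung_box_q(alternating, max_lag=2)

# 5% critical value of a chi-square distribution with 2 degrees of freedom.
CHI2_95_DF2 = 5.991
reject_white_noise = q > CHI2_95_DF2
```

The lag-1 autocorrelation of the alternating series is strongly negative (a high value is followed by a low one), and Q lands well above the critical value, so the series would not be treated as white noise.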
What is implied if the first correlation function is significant?
2 consecutive points are correlated.
What does a positive or negative ACF(1) imply?
Positive: High today implies high tomorrow (continued trend).
Negative: High today implies low tomorrow (reversal of trend).
What's the difference b/w the autocorrelation function (ACF), partial autocorrelation function (PACF) and the inverse autocorrelation function (IACF)?
ACF: function of all autocorrelations b/w 2 sets of obs through time for all values of k (time steps).
rho(k) = Corr(Y(t), Y(t-k))
PACF: correlation b/w 2 sets of obs separated by k points in time after adjusting for all autocorrelations at the lags b/w the two points. PACF is conditional and tries to measure the direct relationship b/w 2 sets of obs w/o the influence of other sets in between. Uses regression to identify relationships.
phi(k) = Corr(Y(t), Y(t-k) | Y(t-1), Y(t-2), ..., Y(t-k+1))
IACF: overemphasizes seasonal effects - very helpful in identifying seasonal data. Similar to PACF but with a different method of calculation using linear algebra. Typically has opposite sign to PACF.
How is PACF like linear regression?
Run a multiple linear regression of Y(t) on its lags 1 through k; the coefficient on lag k is the partial autocorrelation (phi) at lag k. Using regression isolates the effect of each lag.
What's the relationship b/w PACF and ACF?
They are complementary, using both you can determine how far back the relationship really goes.
What are the naive models for regression and time series?
Regression: mean of y.
Time Series: y(t-1), the previous time period.
What is stationarity?
Like the independence assumption in regression, stationarity should exist in the background. A stationary time series has constant mean and variance. A time series w/ long term trend or seasonal data cannot be stationary b/c the mean of the series depends on the time the value is observed. There must be randomness of the mean; this is a condition of stationarity.
For seasonal data, the mean is 0. Is there stationarity?
There is no randomness to the mean, so a pattern exists - no stationarity.
What do all stationary models revert to?
the mean of the series.
How do we account for non-stationarity?
Use Dickey-Fuller Test to decide if Deterministic or Stochastic.
Trend:
- Linear Regression - Deterministic Trend
- Differencing - Stochastic Trend
Seasonality:
- Linear Regression - Residuals are Stationary
- Differencing - Stochastic trend on Seasons
What are two types of Trend?
Deterministic - mathematical function of time (linear, quadratic, etc.)
Stochastic - future values depend on past values + error
In trending models, what does the forecast revert to?
The linear trend (regression model) not the mean. The residuals revert to their mean which is 0.
What is a common Stochastic Trend model?
Random walk with drift. No matter what time period you are in, the previous time period affects it directly b/c the coefficient of Y(t-1) is 1.
How is Stochastic Trend addressed?
Pattern exists in the difference of Y(t) - Y(t-1). This is called Differencing.
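A short sketch of differencing applied to a random walk with drift, in plain Python. The `difference` helper, the drift value and the hand-picked zero-mean shocks are assumptions for illustration only.

```python
def difference(y, lag=1):
    """Return y(t) - y(t - lag); lag=1 is an ordinary first difference."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]


# Random walk with drift: y(t) = y(t-1) + drift + e(t).
drift = 2.0
shocks = [0.5, -1.0, 0.25, 0.75, -0.5, 1.0, -0.75, -0.25]  # made-up errors, mean zero
walk = [100.0]
for e in shocks:
    walk.append(walk[-1] + drift + e)

diffs = difference(walk)
mean_diff = sum(diffs) / len(diffs)  # estimates the drift
```

The differenced series is just drift + e(t), so it fluctuates around a constant mean even though the original walk wanders. Passing a larger `lag` (for example the season length) gives a seasonal difference instead.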
What is the order in which a modeler should address Trend and Seasonality?
Season first, then Trend. Sometimes fixing seasonality will also fix a trend.
Describe stochastic and deterministic seasonality.
Stochastic: future seasonal values depend on past seasonal values + error. Deterministic: mathematical function of seasonal dummy variables or trig function.
What is over-differencing and why is it a problem?
Over-differencing is the process of taking a difference in the presence of a deterministic trend or taking too many differences in a stochastic trend. It's a problem b/c it introduces more dependence on error terms in your model: it creates moving average terms that don't exist in the original series.
How do you know when to take a difference?
Augmented Dickey-Fuller (ADF) Unit Root Test.
Ho: non-stationarity (differencing required)
Ha: stationarity, in 3 forms: Zero Mean, Single Mean, Trend
Use only if you have data that is not seasonal!
What statistics are used to determine p-values for the Augmented Dickey-Fuller test?
Lag 0 - Rho
Lag >= 1 - Tau
Do not use the F-Test.
What is meant by short, moderate or long term memory as it applies to a stationary time series?
How long the effect of an observation persists in the model.
What is the fundamental relationship underlying Autoregressive (AR) models?
That a relationship b/w Yt and Yt-1 exists for all one-time-period differences across the dataset. Even long-ago points have a small influence. The idea is that you can recursively solve for Yt.
For AR models, how does PHI impact the model?
Long-ago shocks have little effect on the present only if abs(PHI) < 1. If PHI = 1, the model is a random walk.
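To make the role of PHI concrete, here is a minimal sketch that traces a single unit shock through an AR(1) model with no further errors. The helper name `impulse_response` is hypothetical.

```python
def impulse_response(phi, periods):
    """Path of y(t) = phi * y(t-1) + e(t) after a single unit shock at t = 0."""
    path = [1.0]  # the shock itself
    for _ in range(periods):
        path.append(phi * path[-1])  # no further shocks arrive
    return path


fading = impulse_response(0.5, 4)     # abs(PHI) < 1: the shock dies out
permanent = impulse_response(1.0, 4)  # PHI = 1: random walk, the shock never fades
```

After k periods the remaining effect is PHI to the power k, which is why a PHI close to 1 corresponds to long memory and a PHI near 0 to short memory.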
Describe difference in correlation functions between AR(1) and AR(P).
AR(1)
- ACF: Exponential decrease
- PACF/IACF: Significant spike at first lag, then nothing
AR(P) - Autoregressive Process of order P
- ACF: Variety of Patterns
- PACF/IACF: Significant spikes at significant lags up to p lags, then nothing.
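The AR(1) signature above can be checked numerically. The sketch below converts a set of autocorrelations into partial autocorrelations using the Durbin-Levinson recursion. This is a swapped-in technique chosen because it needs no regression library; it is not how the deck itself computes the PACF. The function name `pacf_from_acf` is made up, and the inputs are theoretical ACF values (rho(k) = PHI^k for an AR(1)), not estimates from data.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations from autocorrelations via Durbin-Levinson.

    rho[k] is the autocorrelation at lag k, with rho[0] == 1.
    Returns [phi_11, phi_22, ...] for lags 1 .. len(rho) - 1.
    """
    max_lag = len(rho) - 1
    pacf = []
    prev = []  # AR coefficients of the order k - 1 fit
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the order-k coefficients from the order-(k-1) ones.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Theoretical ACF of an AR(1) with PHI = 0.5: exponential decrease.
ar1_pacf = pacf_from_acf([0.5 ** k for k in range(5)])

# Theoretical ACF of an MA(1) with coefficient 0.5: one spike of 0.5 / 1.25 = 0.4.
ma1_pacf = pacf_from_acf([1.0, 0.4, 0.0, 0.0, 0.0])
```

The AR(1) input yields a single PACF spike at lag 1 followed by zeros, while the MA(1) input yields a PACF that shrinks in magnitude lag by lag, which is the mirror-image pattern.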
What are Moving Average models based on and what kind of events are they good at describing?
MA models forecast based on past error values. Good for describing events whose effect only lasts for short periods of time.
Describe difference in correlation functions between MA(1) and MA(Q).
MA(1)
- ACF: Significant spike at first lag, then nothing
- PACF/IACF: Exponential decrease as number of lags increases
MA(Q) - Moving Average Process of order Q
- ACF: Significant spikes at significant lags up to q lags, then nothing.
- PACF/IACF: Variety of Patterns
For AR and MA models, does stationarity need to be achieved before modeling?
What is the relationship b/w AR and MA models?
Complementary: AR and MA models are opposites of each other. In certain situations, an MA model can be represented as an infinite-order AR model and vice versa.
What are some automated selection techniques used for model identification for stationary models?
Minimum Information Criterion: calculates a lot of models and gives you the model with the smallest BIC.
Smallest Canonical Correlation (SCAN)
Extended Sample Autocorrelation Function (ESACF)
Both SCAN and ESACF try to figure out correlation patterns for you.
How do AR and MA models revert to the mean over time?
MA - Quickly
AR - Slowly
What are the two estimation methods available for ARIMA?
Conditional Least Squares (CLS)
Maximum Likelihood Estimation (MLE) - preferred
Change MaxIter or go back to CLS if MLE does not converge.
How are Seasonal Dummy variables used in forecasting and what are drawbacks?
Method:
1. Create dummy variables
2. Use regression to calculate the effect of each variable
3. Use regression output in ARIMA
Problems:
1. Dummy variables don't weight more recent observations higher
2. Don't account for trend
What is the name of the most common modeling technique for Seasonal ARIMA models?
Box-Jenkins - applying an ARIMA structure on the seasonal effect.
ARIMA(p,d,q)(P,D,Q)s
p,d,q are orders of nonseasonal terms (ARIMA part)
P is order of Seasonal AR
Q is order of Seasonal MA
D is order of Seasonal Difference
s is length of seasonal period
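Written out with the backshift operator $B$ (where $B Y_t = Y_{t-1}$), this notation is commonly expanded into the multiplicative seasonal form below. It is shown as a sketch with any constant term omitted. The polynomial symbols $\phi_p, \Phi_P, \theta_q, \Theta_Q$ are the usual textbook names for the nonseasonal AR, seasonal AR, nonseasonal MA and seasonal MA polynomials; they are not defined in this deck.

```latex
\phi_p(B)\,\Phi_P(B^s)\,(1 - B)^d\,(1 - B^s)^D\,Y_t = \theta_q(B)\,\Theta_Q(B^s)\,e_t
```

Here $(1 - B)^d$ is the ordinary difference taken $d$ times and $(1 - B^s)^D$ is the seasonal difference taken $D$ times.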
What is AR(1,12) in Box-Jenkins notation? ARIMA(p=1,2,3,12)(q=12,24)?
Can you use the Augmented Dickey-Fuller test for seasonal unit roots?
ADF is hindered in the presence of seasonal unit roots; need to use the Seasonal Augmented Dickey-Fuller test.
What is used to eliminate a seasonal unit root?
A seasonal difference
What do theoretical Additive and Multiplicative correlation functions look like?
What is necessary to be able to add explanatory variables to a forecasting model?
Knowledge or forecasts of future values of the explanatory variables. Ex. when forecasting monthly sales, you could have a promotion indicator variable.
What do event variables typically do to a model?
Primarily act as intercept shifters to accommodate discrete shifts, also called jumps or bangs, in the time series.
Describe the 3 types of intervention variables?
1. Point or Pulse Interventions - one-time event that reverts to the original intercept (Sales Promotion)
2. Step - an event occurred and now you're at a new intercept (Charging money for Directory Assistance)
3. Ramp - takes time for effect to take place (new law enacted)
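The three intervention types can be coded as simple indicator series. This is a minimal sketch: the function names are made up, and the ramp is assumed to grow by 1 each period after the event, which is only one of several possible codings.

```python
def pulse(n, t0):
    """1 only in the event period, 0 elsewhere."""
    return [1 if t == t0 else 0 for t in range(n)]


def step(n, t0):
    """0 before the event, 1 from the event period onward."""
    return [1 if t >= t0 else 0 for t in range(n)]


def ramp(n, t0):
    """0 up to the event, then growing by 1 each period after it."""
    return [max(0, t - t0) for t in range(n)]


# Eight periods with the event in period 3 (counting from 0).
pulse_var = pulse(8, 3)
step_var = step(8, 3)
ramp_var = ramp(8, 3)
```

Each list would enter the model as an extra regressor, so its coefficient measures the size of the shift.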
What is the difference b/w Deterministic and Stochastic intervention structures?
Deterministic - the effect has no lag structure.
Stochastic - the intervention could have a lag structure to how it dissipates across time.
For stochastic interventions, what are omega(B) and delta(B) and how do they determine how the intervention point influences the data?
B is the backshift operator.
The intervention effect is modeled as the ratio omega(B)/delta(B).
omega(B): effect lasts for a short time period.
delta(B): effect of the intervention lasts for an infinite period of time.
What are transfer functions?
Time series functions with covariates other than lags of Y in the model. Don't have to be at same time point as Y, can be lags of X.
Do you want X variables to lead or lag Y?
Lead. If lagging that would mean that Y predicts X.
In time series, what is cross-correlation?
The relationship between two time series (Yt and Xt); the series Yt may be related to past lags of X. The Cross-Correlation Function (CCF) is helpful for identifying lags of the X variable that might be useful predictors of Y. Ex. when it gets hot there is a lag in energy transfer.
What is Pre-Whitening and how is it done?
Process of making a series into white noise before putting it in the model. The correlation structure of the X variable could hamper our estimates of the cross-correlation b/w X and Y.
First develop a model for X that leaves only white noise, then Model
In cross-correlation what does a shift of 2 signify?
Y feels the effect after two time periods.
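A small sketch of how such a shift shows up in the cross-correlation function, in plain Python. The helper names and both series are made up: X is an irregular sequence with little autocorrelation of its own (the situation pre-whitening aims for), and Y is built to respond to X exactly two periods later.

```python
def pearson(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


def cross_correlation(y, x, max_shift):
    """Correlation between y(t) and x(t - k) for shifts k = 0 .. max_shift."""
    return [pearson(y[k:], x[: len(x) - k]) for k in range(max_shift + 1)]


x = [1, -2, 3, 0, -1, 2, -3, 1, 0, 2, -1, -2, 3, 1, -3, 0, 2, -1, 1, -2]
y = [0, 0] + [2 * v for v in x[:-2]]  # y(t) = 2 * x(t - 2)

ccf = cross_correlation(y, x, max_shift=4)
best_shift = max(range(len(ccf)), key=lambda k: abs(ccf[k]))
```

The largest cross-correlation sits at a shift of 2, matching the two-period delay built into Y.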
What are the 7 steps to the General Transfer Function Modeling Technique?
1. Identify and estimate model for X
2. Pre-whiten Y and X
3. Compute cross-correlation
4. Fit Transfer Function
5. Model remaining residuals of Y
6. Evaluate model fit
7. Forecast X and Y
Describe an Autoregressive Neural Network?
Typical neural network but with lags of Y added to input layer. Number of lags determined by Correlation plots.
Why is it better to average forecasts?
Biases among the methods and/or forecasters will compensate for one another. This method is especially relevant for long-range forecasting, where uncertainty is extremely high.