Reading 2: Time Series Analysis Flashcards

(35 cards)

1
Q

What is a time series? + examples

A

It’s a sequence of data points collected over time — the evolution of a variable over time.
Think of it like: tracking your monthly expenses or daily steps over a year.

Examples:
* Quarterly sales over the last 5 years
* Monthly CPI over the last 12 years

2
Q

What are the main types of trend models in time-series analysis?

A

Linear Trend Model: Assumes a straight-line change over time.
Log-Linear Trend Model: Assumes exponential growth or decline.
Autoregressive Model (AR): Uses past values to predict future ones.

3
Q

How do lagged models differ from trend models in time-series analysis?

A
  • Instead of using time as the predictor (like trend models do), lagged models use past values of the variable itself to predict future values.
  • They focus on how previous observations influence the current one.

Think of it like:
Predicting today’s temperature based on yesterday’s temperature, rather than just the date.

4
Q

What does a Linear Trend Model do? What would the independent variable be?

A

It predicts future values by adding a constant amount each time period.

Time is the independent variable.

Formula:
y_t = b0 + b1(t) + ε_t

Example: If your salary increases by £3,000 every year, that’s a linear trend.
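A minimal sketch of estimating a linear trend by OLS, using hypothetical data where the value rises by exactly 3 per period:

```python
import numpy as np

# Hypothetical series: starts at 10 and rises by exactly 3 each period
t = np.arange(1, 11)
y = 10 + 3 * t

# OLS fit of y = b0 + b1 * t (degree-1 polynomial in t)
b1, b0 = np.polyfit(t, y, deg=1)
print(round(b0, 4), round(b1, 4))  # 10.0 3.0
```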

5
Q

What is a Log-Linear Trend Model? + formula

A

It predicts values that grow (or shrink) at a constant percentage rate over time, which makes it ideal for modelling exponential growth.

Formula:
ln(y_t) = b0 + b1(t) + ε_t

Equivalently, the predicted level is ŷ_t = e^(b0 + b1(t)).

6
Q

How would you use a Log-Linear Model to predict values at t=49?
b0 = 4 and b1 = 0.09

A
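Since the log-linear model fits ln(y_t) = b0 + b1(t), the forecast is recovered by exponentiating: ŷ_49 = e^(4 + 0.09 × 49) = e^8.41 ≈ 4,491.8. A quick check in Python:

```python
import math

# Log-linear trend model: ln(y_t) = b0 + b1 * t
# Coefficients from the question: b0 = 4, b1 = 0.09
b0, b1 = 4, 0.09
t = 49

# The point forecast is recovered by exponentiating the fitted log value
y_hat = math.exp(b0 + b1 * t)  # e^8.41, roughly 4491.8
print(round(y_hat, 1))
```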
7
Q

If you were to graph the predicted values from a Linear Trend Model, a Log-Linear Trend Model, and an Exponential Trend Model, how would the shapes of the lines differ over time?

A

Linear trend: a straight line, rising or falling by the same amount each period.
Log-linear trend: a curve in the original units that bends increasingly upward (for growth) because values change by a constant percentage; plotted in logs it becomes a straight line.
Exponential trend: the same curved shape as the log-linear model in level terms, since the log-linear model is an exponential trend estimated in log form.
8
Q

What is serial correlation in time-series trend models, and how does it affect the reliability of regression results?

A

Serial correlation (also called autocorrelation) occurs when the errors (residuals) from a regression model are correlated across time—meaning the error in one period is related to the error in another.

Why it matters:
* Violates a key regression assumption: that errors are independent.
* Leads to biased standard errors, which can distort hypothesis tests and confidence intervals.
* Makes the model less reliable for forecasting.

9
Q

What should you do if your linear trend model shows serial (auto)correlation in the regression errors?

A

If a linear trend model shows serial correlation (i.e., errors are correlated across time), follow these steps:

Test for serial correlation using the Durbin-Watson statistic.
If serial correlation is present:
* Try a log-linear model instead.
* If the log-linear model still shows autocorrelation, switch to an autoregressive model (AR).
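
As a sketch of the first step, the Durbin-Watson statistic can be computed directly from the regression residuals (values near 2 suggest no serial correlation; near 0, positive serial correlation; near 4, negative):

```python
import numpy as np

def durbin_watson(resid):
    """DW = sum of squared successive differences / sum of squared residuals."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

# Residuals that never change sign -> strong positive serial correlation
print(durbin_watson([1, 1, 1, 1, 1]))                  # 0.0
# Residuals that flip sign every period -> strong negative serial correlation
print(round(durbin_watson([1, -1, 1, -1, 1, -1]), 2))  # 3.33
```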

10
Q

What is Ordinary least squares (OLS) regression?

A

OLS regression is the standard estimation method: it chooses the trend-line coefficients (b0 and b1) that minimise the sum of squared residuals between the observed values and the fitted line.

11
Q

What is the general rule on whether to use a log-linear model or linear trend model?

A

If the variable grows at a constant rate (a constant percentage change), a log-linear model is most appropriate.

If the variable increases over time by a constant amount, a linear trend model is most appropriate.

12
Q

What is an autoregressive model (AR) ?

A

A model in which the dependent variable is regressed against one or more lagged values of itself. It predicts a variable using its own past values.

** e.g. sales for a firm could be regressed against the firm's sales in the previous month

13
Q

What does it mean for a time series to be covariance stationary, and why is it important for autoregressive models?

What are the 3 conditions necessary to be considered covariance stationary?

A

A time series is covariance stationary if its behaviour stays stable over time.

  1. Constant and finite expected value (average/mean stays the same)
  2. Constant and finite variance (spread doesn’t change)
  3. Constant and finite covariance between values at any given lag (relationship between values over time is consistent)

Autoregressive models assume the data is stable.
If the data drifts or trends too much, the model can give misleading/meaningless forecasts.

14
Q

How does an AR(1) model forecast future values using past data? Explain using yesterday, today and tomorrow.

Imagine x0 = 5, b0 = 1.2 and b1 = 0.45, how would you calc Monday's and Tuesday's forecasts?

A

An AR(1) model uses yesterday’s value to predict today’s, and then uses today’s prediction to forecast tomorrow’s.
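
With x0 = 5, b0 = 1.2 and b1 = 0.45: Monday's forecast is x̂1 = 1.2 + 0.45(5) = 3.45, and Tuesday's forecast chains on it: x̂2 = 1.2 + 0.45(3.45) = 2.7525. A minimal sketch:

```python
def ar1_forecast(x0, b0, b1, steps):
    """Chain AR(1) forecasts: each period's forecast feeds the next one."""
    forecasts = []
    x = x0
    for _ in range(steps):
        x = b0 + b1 * x
        forecasts.append(x)
    return forecasts

print([round(v, 4) for v in ar1_forecast(x0=5, b0=1.2, b1=0.45, steps=2)])
# Monday: 3.45, Tuesday: 2.7525
```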

15
Q

What do you do if your AR model shows serial correlation? What wouldn’t you use?

A
  • Durbin-Watson is not valid for AR models.
  • Use t-tests on residual autocorrelations.
  • If significant autocorrelation exists, the model is incomplete.
  • Fix: Increase the number of lags (e.g., move from AR(1) to AR(2)) or adjust for seasonality.
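A sketch of the residual-autocorrelation t-test: the standard error of each residual autocorrelation is approximately 1/√T, so t = autocorrelation × √T.

```python
import numpy as np

def autocorr(resid, lag):
    """Sample autocorrelation of the residuals at a given lag."""
    resid = np.asarray(resid, dtype=float)
    e = resid - resid.mean()
    return np.sum(e[:-lag] * e[lag:]) / np.sum(e ** 2)

def autocorr_tstat(resid, lag):
    """t-statistic: autocorrelation divided by its standard error 1/sqrt(T)."""
    return autocorr(resid, lag) * np.sqrt(len(resid))

# Residuals that flip sign every period have lag-1 autocorrelation near -1,
# giving a large (significant) t-statistic
resid = [1, -1] * 10
print(round(autocorr_tstat(resid, lag=1), 2))  # about -4.25, significant
```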
16
Q

Why does increasing the number of lags (e.g., from AR(1) to AR(2)) or adjusting for seasonality help fix serial correlation?

A

Because serial correlation means the model is missing patterns in the data.
Increase Lags (AR(1) → AR(2)):
* Adds more past values to the model.
* Captures short-term dependencies that a single lag might miss.
* Helps explain more variation, reducing leftover structure in the errors.

Analogy:
Predicting today’s mood using only yesterday’s might miss the influence of two days ago. Adding that second lag gives a fuller picture.

Adjust for Seasonality:
* Accounts for repeating patterns (e.g., monthly or yearly cycles).
* Prevents the model from mistaking seasonal effects for random error.
* Common fix: include seasonal lags (e.g., lag 12 for monthly data).

Analogy:
Ice cream sales spike every summer. If the model doesn’t know that, it might treat the spike as an error — when it’s actually seasonal.

17
Q

How do you test for autocorrelation in an AR(1) model? Based on the example below and your findings, what would you do?

How do you calculate the SE and t stat?

A

SE of each residual autocorrelation ≈ 1/√T (T = number of observations), and t-stat = residual autocorrelation ÷ SE.

Compare to Critical Value
Assume a critical t-value of ±2.0 (95% confidence level):

Lag 2 t-stat = 2.3784 > 2.0 → Significant autocorrelation detected

The AR(1) model is incomplete — it doesn’t capture all the time-based patterns.
Fix:
* Increase the number of lags (e.g., move to AR(2)) to include more past values.
* Adjust for seasonality if patterns repeat over time (e.g., monthly cycles).

More lags help the model capture short-term dependencies that were missed.
Seasonal adjustments account for recurring patterns, reducing unexplained variation.

18
Q

What does it mean if a time series is mean-reverting? For an AR(1) model, how would you calc the MRL?

What does b1 represent and what does a small or big value mean?

A

The dependent variable tends to return to a long-term average over time.

For an AR(1) model, the mean-reverting level is:
MRL = b0 / (1 − b1)

If above the mean → expected to fall
If below the mean → expected to rise

b1 controls how strongly the past value influences the current one.

If b1 is close to 1, the series reverts slowly.
If b1 is small, the series reverts quickly.

Analogy:
Imagine a ball rolling toward a resting point:
b0 is like the initial push.
MRL is where the ball eventually comes to rest.
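
Using the hypothetical AR(1) coefficients from the earlier forecasting card (b0 = 1.2, b1 = 0.45), the mean-reverting level is 1.2 / (1 − 0.45) ≈ 2.18, and chained forecasts converge to it:

```python
b0, b1 = 1.2, 0.45   # hypothetical AR(1) coefficients with |b1| < 1
mrl = b0 / (1 - b1)  # mean-reverting level
print(round(mrl, 4))  # 2.1818

# Chained point forecasts approach the MRL regardless of the start value
x = 5.0
for _ in range(50):
    x = b0 + b1 * x
print(round(x, 4))    # 2.1818
```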

19
Q

When comparing forecasting models, how can you tell which one performs better, and why is it important to use out-of-sample data?

A

Use the Root Mean Squared Error (RMSE) to measure how accurate a model’s predictions are.

RMSE tells you the average size of prediction errors — lower is better.
In-sample data is what the model was trained on.
Out-of-sample data is new data that tests how well the model generalises.

Why out-of-sample RMSE matters:

It shows how the model performs in real-world scenarios.
Helps avoid overfitting, where a model looks great on training data but fails on new data.

Bottom line:
Choose the model with the lowest RMSE on out-of-sample data — it’s more likely to make reliable forecasts.

Analogy:
Think of RMSE like a golf score — the lower it is, the better your aim (or prediction accuracy).
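
RMSE is just the square root of the average squared forecast error; a minimal sketch with made-up forecasts:

```python
import math

def rmse(actual, predicted):
    """Root mean squared error of a set of forecasts."""
    errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(errors) / len(errors))

# Out-of-sample comparison: pick the model with the lower RMSE
actual = [10, 12, 11, 13]
model_a = [9, 12, 12, 13]    # errors: 1, 0, 1, 0
model_b = [8, 14, 11, 15]    # errors: 2, 2, 0, 2
print(round(rmse(actual, model_a), 2))  # 0.71 -> preferred model
print(round(rmse(actual, model_b), 2))  # 1.73
```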

20
Q

What is regression coefficient instability, and how does it affect the reliability of time-series models over different time periods?

A

Regression coefficient instability means the estimated relationships in your model change over time.

This can happen if the economic environment shifts or the data-generating process evolves.
It creates a trade-off:
* Long time series offer more data but may be less stable.
* Short time series are more stable but may lack statistical power.

Why it matters:
* If coefficients aren’t stable, your model may not be reliable for forecasting.
* You may need to re-estimate the model or use rolling windows to adapt to changing conditions.

Analogy:
It’s like using last year’s map to navigate a city that’s constantly under construction — the roads may have changed.

21
Q

What defines a random walk in time-series data, and what does it mean for a random walk to have drift?

A

A random walk is a time-series process where each value equals the previous value plus a random shock:
x_t = x_(t−1) + ε_t

The coefficient on x_(t−1) is 1, meaning the series does not revert to a mean and can drift indefinitely.

A random walk with drift adds a constant term, so the series tends to move in one direction over time:
x_t = b0 + x_(t−1) + ε_t, with b0 ≠ 0

Analogy:
Imagine walking in a fog:
Without drift: You take random steps — sometimes forward, sometimes back.
With drift: There’s a gentle slope pushing you forward, so you tend to move in one direction over time.

22
Q

What is a unit root in an AR(1) model, and why must the coefficient be less than 1?

A

A unit root exists when b1 = 1 (the coefficient on the lagged value equals one).

For the model to be stationary (i.e., stable over time), the coefficient b1 must be less than 1 in absolute value:
If b1<1
The series is mean-reverting and stationary — it fluctuates around a long-term average.

If b1=1
The series has a unit root and becomes a random walk — it does not revert to a mean and can drift endlessly.

Analogy
Imagine a balloon floating in the wind:

If it has no anchor (unit root), it drifts wherever the wind blows — unpredictable and unstable.
If it’s tied to a post (stationary), it might sway, but it stays near the centre.

23
Q

How does the Dickey-Fuller test help you figure out if a time series has a unit root and is nonstationary?

A

The Dickey-Fuller (DF) test transforms the AR(1) model by subtracting x_(t−1) from both sides:
Δx_t = b0 + g1·x_(t−1) + ε_t, where g1 = b1 − 1

This checks whether the series is just drifting randomly (i.e., has a unit root) or reverts to a mean (i.e., is stationary).

Dickey-Fuller Test Logic

  • Null Hypothesis (H₀): g1 = 0 → means b1 = 1 → unit root exists → nonstationary
  • Alternative Hypothesis (Hₐ): g1 < 0 → means b1 < 1 → no unit root → stationary

You calculate a t-statistic for g1 and compare it to special DF critical values (not the usual ones); reject H₀ and conclude the series is stationary only if the statistic is more negative than the critical value.
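
A rough sketch of the idea (not the full DF test, which needs its own critical values): regress the first difference on the lagged level and inspect g1. The data below is simulated with b1 = 0.6, so the estimate of g1 = b1 − 1 should come out near −0.4.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a stationary AR(1) series: x_t = 0.6 * x_(t-1) + noise
n, b1 = 2000, 0.6
x = np.zeros(n)
for t in range(1, n):
    x[t] = b1 * x[t - 1] + rng.standard_normal()

# Regress delta_x on lagged x (with intercept) via least squares
dx = np.diff(x)
X = np.column_stack([np.ones(n - 1), x[:-1]])
(b0_hat, g1_hat), *_ = np.linalg.lstsq(X, dx, rcond=None)
print(round(g1_hat, 2))  # clearly negative (near -0.4): evidence against a unit root
```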

24
Q

What is first differencing in time-series analysis, and how does it help fix nonstationary data with a unit root?

A

First differencing is a technique used to transform a nonstationary time series into a stationary one, which is essential for reliable forecasting. It removes the drift.

Basic First Difference Formula:
y_t = x_t − x_(t−1)
* This is the raw change in the original variable x from one time period to the next.
* It’s a transformation, not a model yet.
* Useful for removing trends and unit roots.

AR-style Model of the Differenced Series:
y_t = b0 + b1·y_(t−1) + ε_t
* This is a regression model applied to the differenced data.
* It assumes that today’s change depends on yesterday’s change, plus a constant and some noise.
* Helps forecast future changes based on past changes in the original series.
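
A tiny illustration: differencing a steadily trending series flattens it to a constant (stationary) series.

```python
import numpy as np

# A series that trends upward by 3 each period (nonstationary in level)
x = np.array([3, 6, 9, 12, 15])

# First difference: y_t = x_t - x_(t-1)
y = np.diff(x)
print(y.tolist())  # [3, 3, 3, 3] -> fluctuates around a stable mean
```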

25
Q

Explain a useful analogy that reflects first differencing.

A

Imagine you’ve just started a new job and you’re tracking how long your commute takes each day:
Week 1: You’re still figuring out the route, so commute times are long and inconsistent.
Week 2: You discover shortcuts and get more confident, so commute times start to trend downward.
This is like a nonstationary series: the values (commute times) are changing in a consistent direction.

Now, instead of tracking your commute time, you track the daily change:
Some days you shave off 2 minutes, other days it’s the same, or maybe slightly longer.
These daily changes fluctuate around a stable average: this is your first-differenced series, which is stationary.

Even if the original data trends (e.g., commute times getting shorter), the day-to-day changes can be more stable and suitable for modelling.
26
Q

What effect does first differencing have on a trending time series?

A

It removes the upward or downward trend, flattening the series so it fluctuates around a stable mean. This makes it suitable for modelling with AR techniques.
27
Q

Why is a unit root a problem?

A

The series becomes nonstationary: its mean and variance change over time.
You can’t define a mean-reverting level (MRL = b0 / (1 − b1) becomes undefined because the denominator is 0).
Regression results become unreliable, and forecasts may be misleading.
28
Q

What is the difference between in-sample forecasts and out-of-sample forecasts?

A

In-sample forecasts are made within the range of data (i.e. the time period) used to estimate the model.
Out-of-sample forecasts are made outside of the sample period.
29
Q

What is the root mean squared error (RMSE) criterion?

A

It is used to compare the accuracy of autoregressive models in forecasting out-of-sample values.

** The model with the lower RMSE for the out-of-sample data will have lower forecast error and is expected to have better predictive power in the future.
30
Q

What is the random walk process?

A

The predicted value of the series in one period is equal to the value of the series in the previous period plus a random error term.
31
Q

Do a Random Walk or a Random Walk with Drift exhibit covariance stationarity?

A

No. In both cases the variance grows with time, so it is not constant and finite.
32
Q

What are the two tests to determine if a time series is covariance stationary?

A

1. Run an AR model and examine the autocorrelations
2. Perform a Dickey-Fuller test

** #2 is the preferred test.
33
Q

What is it called when we transform a random walk time series to a covariance stationary time series?

A

First differencing.

This involves subtracting the value of the time series in the immediately preceding period from the current value of the time series to define a new dependent variable, y.
34
Q

What is autoregressive conditional heteroskedasticity (ARCH)?

A

When examining a single time series, ARCH is present if the variance of the residuals in one period is dependent on the variance of the residuals in a previous period.
35
Q

What is cointegration?

A

Two time series are cointegrated if they are economically linked or follow the same trend, and that relationship is not expected to change.