Chapter 6 - Univariate time series modeling Flashcards

(50 cards)

1
Q

define univariate time series modeling

A

class of specifications where you attempt to describe changes in a random variable using only the information contained in its own past values, and possibly current and past values of an error term

2
Q

what does time series modeling contrast with

A

Structural models. Structural models, such as regressions estimated by OLS, are multivariate in nature.

3
Q

is time series modeling theoretical?

A

Typically a-theoretical, meaning it is not based on a theory: you do not use financial theory to establish the structure of one of these models. This contrasts with structural models and the general-to-specific approach, which provide a way to model the theoretical foundations.

4
Q

if time series modeling doesn't use known theory, then what does it do?

A

It is based on using empirical observations to extract patterns.

5
Q

what benefit can time series models give that structural models do not?

A

Out-of-sample forecasts.

6
Q

what model is the goal of this chapter?

A

ARIMA

7
Q

most important topic in time series modeling

A

Stationarity

8
Q

why stationarity important

A

The behavior and properties of a series depend critically on whether it is stationary: the standard tools of time series modeling are only valid for stationary series, so stationarity is a make-or-break issue.

9
Q

elaborate on stationarity

A

two types:
1) Strict
2) Weak

A strictly stationary process is one where, for any set of time points t1, t2, ..., tT and any shift k, the joint probability distribution of y_t1, ..., y_tT is identical to that of y_(t1+k), ..., y_(tT+k).

This entails that the probability of "y" falling in some specific interval is the same now as in any other period.

Weak stationarity is more observable and practical. The requirements are:
1) E[y_t] = mu
2) E[(y_t - mu)^2] = sigma^2 < infinity
3) E[(y_t1 - mu)(y_t2 - mu)] = gamma_(t2-t1) for all t1 and t2

These three state: constant mean, constant variance, constant lag-l covariance (a constant autocovariance structure).

10
Q

elaborate on autocovariance structure

A

We use lag-l autocovariance to describe the covariance between a variable and its own value "l" lags earlier. This is constant in a stationary time series.

Thus, if we have a stationary time series, the autocovariance between y_1 and y_11 is the same as that between y_10 and y_20 (both are lag-10 autocovariances).

11
Q

what is the autocovariance function+

A

E[(y_t - E[y_t])(y_(t-s) - E[y_(t-s)])]

This is the lag-s autocovariance function, usually denoted gamma_s.
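
A minimal numpy sketch of the lag-s sample autocovariance; the function name and toy data are hypothetical:

```python
import numpy as np

def sample_autocov(y, s):
    """Sample lag-s autocovariance: mean of (y_t - ybar)(y_{t-s} - ybar)."""
    y = np.asarray(y, dtype=float)
    ybar = y.mean()
    if s == 0:
        return y.var()  # lag-0 autocovariance is just the variance
    return np.mean((y[s:] - ybar) * (y[:-s] - ybar))

y = np.random.default_rng(0).standard_normal(500)  # toy series
print(sample_autocov(y, 1), sample_autocov(y, 5))
```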

12
Q

elaborate on the interpretation of autocovariance

A

Not much on its own, since its magnitude depends on the units in which y_t is measured. Therefore, we use autocorrelation to attach meaning to it.

13
Q

how do we go from autocovariance to autocorrelation?

A

Correlation is covariance divided by the product of the standard deviations. In a stationary time series, the variance at every date is the lag-0 autocovariance gamma_0, so the lag-l autocorrelation reduces neatly to:

tau_l = gamma_l / gamma_0

14
Q

elaborate on acf

A

autocorrelation function.

when we plot all the lag-s autocorrelations up to some lag k, we get what we call the acf.
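
To see one in practice, statsmodels ships a correlogram plotter; a minimal sketch on a toy series:

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf

y = np.random.default_rng(1).standard_normal(300)  # toy series
plot_acf(y, lags=20)  # sample autocorrelations up to lag 20, with confidence bands
plt.show()
```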

15
Q

elaborate on a white noise process

A

A process with (virtually) no discernible structure.

Has the following properties:
1) E[y_t] = mu
2) var[y_t] = sigma^2
3) gamma_(t-r) = sigma^2 if t = r, and 0 otherwise

So, a white noise series is completely memoryless: each observation is independent of all previous values.

If the white noise series y_t has mean 0 and follows a normal distribution, then the sample autocorrelation coefficients are approximately N(0, 1/T).

This is very useful, because given these properties we can test whether a time series is white noise or not. Specifically, given an observed autocorrelation coefficient tau_k, we can test it against having zero mean and variance 1/T using the simple ratio tau_k / sqrt(1/T) = sqrt(T) * tau_k, compared with standard normal critical values.
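
A quick simulation sketch of this property, assuming Gaussian white noise (seed and lag count are arbitrary): the sample autocorrelations should fall inside the approximate 95% band of ±1.96/sqrt(T).

```python
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(42)
T = 500
y = rng.standard_normal(T)       # Gaussian white noise with mean 0

rho = acf(y, nlags=10)           # sample autocorrelations, lags 0..10
bound = 1.96 / np.sqrt(T)        # 95% band implied by N(0, 1/T)
print(np.abs(rho[1:]) > bound)   # should be (almost) all False
```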

16
Q

Elaborate on the Q-statistic

A

The Q-statistic is given by:

Q = T ∑_(k=1)^m tau_k^2

where tau_k is the sample autocorrelation coefficient at lag k, and m is the number of lags in the sum.

This statistic is asymptotically chi-squared distributed with degrees of freedom equal to m, the number of terms in the sum.
This stems from the assumption that the autocorrelation coefficients are normally distributed, so a sum of their squares gives a chi-squared variable.

The null hypothesis is that all m autocorrelations are jointly zero. If even one is significantly non-zero, we reject the null, which basically means that we have evidence of the series NOT being pure white noise.
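
Rather than hand-rolling the sum, statsmodels' acorr_ljungbox can report the Box-Pierce Q alongside the Ljung-Box statistic; a sketch on toy data (the lag choice of 10 is arbitrary):

```python
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

y = np.random.default_rng(7).standard_normal(400)  # toy series
res = acorr_ljungbox(y, lags=10, boxpierce=True)
print(res)  # DataFrame with columns lb_stat, lb_pvalue, bp_stat, bp_pvalue
```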

17
Q

weakness of Q-statistic

A

It performs poorly in small samples.

18
Q

another name for Q-statistic

A

Box-Pierce test

19
Q

can we improve on Box-Pierce?

A

Ljung-Box.

It modifies Box-Pierce to behave better in small samples: Q* = T(T+2) ∑_(k=1)^m tau_k^2 / (T-k). Asymptotically, the two statistics converge.

20
Q

what is portmanteau test

A

A general test that examines multiple hypotheses at once. Box-Pierce and Ljung-Box, which test several autocorrelation coefficients jointly, are portmanteau tests.

21
Q

elaborate on using the fact that white noise autocorrelations have variance 1/T to build a test of whether a single observed autocorrelation coefficient is significant or not

A

we derive this by starting at:

tau_k ~ N(0, 1/T)

standardize:

(tau_k - 0) / sqrt(1/T) ~ N(0, 1)

=> tau_k = z * sqrt(1/T) = z / sqrt(T), where z is a standard normal variable.

We get the bound by asking what value z must reach so that 95% of the probability lies inside it, which is 1.96. So, if we observe |tau_k| > 1.96 * 1/sqrt(T), the coefficient is too large to be consistent with white noise at the 5% level (a two-sided test), given the standard deviation sqrt(1/T).
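
A sketch applying this bound: simulate an AR(1), where autocorrelation genuinely is present, and flag the lags whose sample coefficients breach ±1.96/sqrt(T) (the coefficient 0.6 and the seed are arbitrary):

```python
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(3)
T = 400
y = np.zeros(T)
for t in range(1, T):                      # AR(1) with coefficient 0.6
    y[t] = 0.6 * y[t - 1] + rng.standard_normal()

rho = acf(y, nlags=10)[1:]                 # sample autocorrelations, lags 1..10
bound = 1.96 / np.sqrt(T)                  # two-sided 95% white-noise bound
print([k + 1 for k in range(10) if abs(rho[k]) > bound])  # flagged lags
```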

22
Q

what is power in hyp testing

A

P(reject H0 | H1 is true)

23
Q

introduce moving average process

A

linear combination of white noise series.

A variable y_t depends on current and previous values of the white noise process.
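
A sketch of simulating such a process with statsmodels (the MA coefficients 0.6 and 0.3 are arbitrary; ArmaProcess expects the lag polynomials including the leading 1):

```python
from statsmodels.tsa.arima_process import ArmaProcess

# MA(2): y_t = u_t + 0.6 u_{t-1} + 0.3 u_{t-2}
ma2 = ArmaProcess(ar=[1], ma=[1, 0.6, 0.3])  # ar=[1]: no autoregressive part
y = ma2.generate_sample(nsample=500)
print(y[:5])
```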

24
Q

discuss the lag operator

A

Ly_t = y_{t-1}

L^{i} y_t = y_{t-i}

Also referred to as “backshift operator”
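
In code the lag operator is just a shift; a small pandas sketch:

```python
import pandas as pd

y = pd.Series([10.0, 11.0, 9.5, 12.0])
print(y.shift(1))  # L y_t: entry t now holds y_{t-1} (first entry becomes NaN)
print(y.shift(3))  # L^3 y_t: entry t holds y_{t-3}
```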

25
Q

use lag operator to give the shape of moving average process

A

y_t = mu + u_t + theta_1 u_{t-1} + ... + theta_q u_{t-q}, which with the lag operator becomes:

y_t = mu + (1 + theta_1 L + theta_2 L^2 + ... + theta_q L^q) u_t = mu + theta(L) u_t

26
Q

moving average models and processes have a constant term as well. However, we drop it during our calculations. Elaborate on the issues that this carries

A

No issues. This is because we can achieve a zero-mean time series by simply subtracting the mean from each point, provided that the mean is constant. This allows us to remove the constant term, which significantly eases our computations.

27
Q

elaborate on the properties of a moving average process

A

By definition, the MA(q) process has the following properties:
1) Constant mean
2) Constant variance
3) Autocovariances that may be nonzero up to the q'th order, and strictly 0 after that (because the MA(q) process simply contains no shocks older than q lags)

The expected value of the MA(q) process is simply the constant term. To see this, take expectations of y_t = mu + u_t + theta_1 u_{t-1} + ... + theta_q u_{t-q}: every u term has expectation zero, leaving E[y_t] = mu.

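The hard cutoff is easy to verify from the theoretical acf; a sketch with an arbitrary MA(2):

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

ma2 = ArmaProcess(ar=[1], ma=[1, 0.6, 0.3])  # hypothetical MA(2)
print(np.round(ma2.acf(lags=6), 4))          # nonzero at lags 1-2, exactly 0 after
```
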
28
Q

what is Wold's decomposition theorem?

A

Any stationary time series can be decomposed into the sum of two unrelated parts:
1) Deterministic
2) Stochastic

The stochastic part will be an MA(infinity) series.

29
Q

regarding MA processes, why doesn't the first (current) shock have a coefficient that is different from 1?

A

We could make it like that, but it makes things harder. Instead, we divide through by this coefficient, which rescales the other coefficients and the variance of the error term. The outcome is the same.

30
Q

what is the motivation for learning MA

A

1) It is a building block for ARMA and ARIMA.
2) It describes a relationship where a variable is given by a constant term plus some structured oscillation around this level. The structure is predictable, but the magnitude is not.

A key part of MA processes is that they cut off hard at their order. This means that, for instance, an MA(1) process provides a relationship between:
- the mean level
- the latest shock/innovation

It is important to realize that most time series will likely not be pure MA processes, but might have components of it.

31
Q

elaborate on the memory of an MA process

A

Short: only up to its order. Shocks more than q periods old have no effect on an MA(q) process.

32
Q

what happens with the cross products in MA process computations?

A

When we take expectations, terms like E[u_t u_{t-s}] vanish, because the covariance between distinct error terms is 0 as they are white noise.

33
Q

what do we need to compute the acf of a true MA process?

A

We need the expectation formulas for the autocovariances and for the variance. These allow us to form the autocorrelations as tau_s = gamma_s / gamma_0.

34
Q

elaborate on autoregressive process

A

AR(p) looks like this:

y_t = mu + ∑_(i=1)^p ø_i y_{t-i} + u_t

or, using the lag operator, y_t = mu + ∑_(i=1)^p ø_i L^i y_t + u_t.

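A simulation sketch (the coefficients 0.5 and 0.2 are arbitrary; note ArmaProcess takes the AR lag polynomial 1 - ø_1 L - ø_2 L^2, so the signs flip):

```python
from statsmodels.tsa.arima_process import ArmaProcess

# AR(2): y_t = 0.5 y_{t-1} + 0.2 y_{t-2} + u_t
ar2 = ArmaProcess(ar=[1, -0.5, -0.2], ma=[1])
y = ar2.generate_sample(nsample=500)
print(ar2.isstationary)  # True here: the AR roots lie outside the unit circle
```
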
35
Q

what is the stationarity condition for MA models?

A

They are always stationary (for finite order q), provided that the errors are actually white noise.

36
Q

discuss stationarity and AR

A

If an AR model is not stationary, it will explode in value. Therefore, stationarity is crucial. AR models have stationarity conditions.

37
Q

derive the stationarity condition for an AR(p) model

A

We use the notation:

ø(L) y_t = mu + u_t

where ø(L) = 1 - ø_1 L - ø_2 L^2 - ... - ø_p L^p.

Setting the mean to 0, which is easily achieved by a subtraction, we get:

ø(L) y_t = u_t

Then we can isolate y_t:

y_t = ø(L)^{-1} u_t

The process is stationary if this MA(infinity) expansion exists and its coefficients converge to 0, meaning the influence of past shocks (and hence the autocorrelations) dies away as the lag length increases. This holds if and only if all roots of the characteristic equation ø(z) = 0 lie outside the unit circle.

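A sketch checking the condition numerically for the same hypothetical AR(2) as above (np.roots wants coefficients from the highest power down):

```python
import numpy as np

phi1, phi2 = 0.5, 0.2                    # hypothetical AR(2) coefficients
# Characteristic equation: 1 - phi1*z - phi2*z^2 = 0
roots = np.roots([-phi2, -phi1, 1.0])
print(roots, np.all(np.abs(roots) > 1))  # all |root| > 1  =>  stationary
```
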
38
Q

if a root is on the unit circle, is the process stationary or not?

A

Not stationary. The roots have to lie strictly outside the unit circle; a root on the circle is a unit root, the classic non-stationary case.

39
Q

what is the outcome of Wold's decomposition theorem?

A

If we have an AR process that is stationary and zero mean, it is equivalent to an MA(infinity) process.

40
Q

elaborate on the unconditional mean of an AR process

A

Assuming stationarity, it is given by:

E[y_t] = mu / (1 - ø_1 - ø_2 - ... - ø_p)

i.e. the constant divided by ø(L) evaluated at L = 1. For example, an AR(1) with mu = 1 and ø_1 = 0.5 has unconditional mean 1 / (1 - 0.5) = 2.

41
Q

what can we say about the acf of an AR process?

A

If the process is stationary, the acf decays towards zero as the lag increases, rather than cutting off sharply the way an MA acf does.

42
Q

elaborate on Box-Jenkins

A

A method for building ARMA models.

Step 1) Identification: use graphical methods (acf, pacf) to spot the order of each component.
Step 2) Estimation: estimate the parameters, typically by MLE.
Step 3) Model checking. Two parts are important here: one is overfitting, the other is residual diagnostics. Overfitting here is not the traditional sense of the word: it means deliberately fitting a larger model than required and checking that the extra terms are insignificant. Residual diagnostics refers to checking the residuals for linear dependence, which if present would indicate that the model has not captured all the patterns in the data.

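A sketch of steps 2 and 3 on simulated data (the true orders and coefficients are arbitrary; step 1 would be the acf/pacf plots):

```python
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

y = ArmaProcess(ar=[1, -0.5], ma=[1, 0.3]).generate_sample(nsample=500)

res = ARIMA(y, order=(1, 0, 1)).fit()      # step 2: estimate ARMA(1,1) by MLE
print(res.params)
print(acorr_ljungbox(res.resid, lags=10))  # step 3: residuals should be white noise
```
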
43
Q

is Box-Jenkins perfect?

A

The book says to use graphical methods to find the order, but then notes that complex cases require information criteria to select the order. The reasoning is that the acf and pacf are difficult to interpret on messy data.

44
Q

elaborate on information criteria

A

They balance the residual sum of squares (the fit) against a penalty for having a higher model order, i.e. more parameters.

45
Q

what is the difference between various information criteria?

A

They share the same residual sum of squares (variance) part, but differ in how stiff the penalty for higher orders is.

46
Q

which information criterion should be chosen?

A

None is strictly better than the others, but they have different properties. AIC is not consistent, but is generally efficient. SBIC is consistent, but inefficient.

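A sketch of order selection by information criteria, using a simulated stand-in series (the grid up to order 2 is arbitrary):

```python
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima.model import ARIMA

y = ArmaProcess(ar=[1, -0.5], ma=[1, 0.3]).generate_sample(nsample=500)

fits = {(p, q): ARIMA(y, order=(p, 0, q)).fit() for p in range(3) for q in range(3)}
print(min(fits, key=lambda k: fits[k].aic))  # AIC's pick
print(min(fits, key=lambda k: fits[k].bic))  # SBIC's stiffer penalty may pick a smaller model
```
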
47
Q

what is ARIMA?

A

An ARMA model whose autoregressive component's characteristic equation has a root on the unit circle. ARIMA(p, d, q) is the same as ARMA(p, q) applied to a series differenced d times. The point of differencing is to remove non-stationarity. When selecting model orders one could fit ARIMA models directly, checking differencing orders until the unit root disappears, but it is more common to difference the time series first and then fit an ARMA model.

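A sketch of the equivalence on a hypothetical I(1) series (trend='n' makes the differenced fit match ARIMA's default of no constant when d > 0):

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima.model import ARIMA

# I(1) series: cumulative sum of a stationary ARMA(1,1)
y = ArmaProcess(ar=[1, -0.5], ma=[1, 0.3]).generate_sample(nsample=500).cumsum()

res_arima = ARIMA(y, order=(1, 1, 1)).fit()                     # fit ARIMA directly
res_arma = ARIMA(np.diff(y), order=(1, 0, 1), trend="n").fit()  # difference, then ARMA
print(res_arima.params)
print(res_arma.params)  # essentially the same estimates
```
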
48
Q

does differencing work on all kinds of non-stationary patterns?

A

No. It works best on trends. It is useless for cyclical patterns.