Slide coverage Flashcards

(33 cards)

1
Q

there is a CLRM assumption that can replace two others. elaborate

A

E[u_t | X_t] = 0

This can replace E[u_t]=0 AND cov(x_t, u_t)=0

2
Q

how to ensure that assumption 1 holds

A

Include a constant term

3
Q

what is endogeneity? relate to exogeneity

A

Exogeneity refers to the assumption that the explanatory variables are uncorrelated with the error terms. In other words: E[u | X] = 0

Endogeneity is the condition in which exogeneity is violated.

4
Q

why would endogeneity be present?

A

This is the same as asking for reasons why the explanatory variables and error terms might be correlated.

If we are missing an important variable, both the constant term and the explanatory variables will try to explain its movement without actually being related to it. If the omitted variable is completely uncorrelated with the included regressors, only the constant will be affected.

Measurement error is also a source.

5
Q

Endogeneity is present. What do we do?

A

The obvious one is to include the missing variables.

If we have panel data, we can exploit it to control for common (fixed) effects.

IV estimation.

Heckman correction.

6
Q

elaborate on instrumental variables

A

This is a method for dealing with endogeneity.

First we split X into two parts:
1) the variables not correlated with the error term
2) the variables that are correlated with the error term

Then we extract the part of X that is NOT correlated with the error term. We use this to get a consistent estimate of its effect on Y.

To do this we use instrumental variables. These are correlated with the endogenous variables in X, but not with the error terms.

We find some other variable, call it Z, that is correlated with the endogenous variables but not with the error terms.
We could have used Z directly, but we want to understand the effect that the endogenous X has on Y. This is why we go to greater lengths to keep it in the model.

We use Z in a separate regression that predicts values for the endogenous variable(s); these fitted values are then used in place of the originals.

This is also referred to as two-stage: first regress on the IV, then run the regression of actual interest.

It is worth noting that good IVs are usually difficult to find.
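The two stages can be sketched with simulated data. This is a minimal sketch using plain numpy; the variable names, coefficients, and the confounder setup are hypothetical, chosen only to make the endogeneity visible:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical data: x is endogenous because it shares the confounder c with the error.
z = rng.normal(size=n)            # instrument: correlated with x, not with u
c = rng.normal(size=n)            # unobserved confounder
x = 0.8 * z + 0.5 * c + rng.normal(size=n)
u = 0.5 * c + rng.normal(size=n)  # error term, correlated with x through c
y = 1.0 + 2.0 * x + u             # true slope is 2.0

def ols(X, target):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, target, rcond=None)[0]

ones = np.ones(n)

# Stage 1: regress the endogenous x on the instrument z, keep the fitted values.
Z = np.column_stack([ones, z])
x_hat = Z @ ols(Z, x)

# Stage 2: regress y on the fitted values (the part of x uncorrelated with u).
X2 = np.column_stack([ones, x_hat])
beta_2sls = ols(X2, y)

# Naive OLS for comparison -- biased upward because cov(x, u) > 0 here.
X1 = np.column_stack([ones, x])
beta_ols = ols(X1, y)

print("OLS slope: ", beta_ols[1])
print("2SLS slope:", beta_2sls[1])  # close to the true value 2.0
```

The naive OLS slope absorbs the correlation between x and u, while the two-stage estimate recovers the true coefficient (up to sampling noise).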

7
Q

elaborate on the possible forecasting methods

A

We have 3:
1) Scaled
2) Iterative
3) Direct

Direct jumps straight to the forecasting horizon, while iterative goes point by point. This leads to different results.

Scaled predicts 1 step ahead and then scales it by the number of steps we actually want. It does not perform well.
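The difference between iterative and direct can be sketched with an AR(1) series. This is a hypothetical simulation (the process, coefficient, and horizon are my own choices, not from the slides): the iterative forecast chains the estimated 1-step model h times, while the direct forecast regresses y_{t+h} on y_t in one shot.

```python
import numpy as np

rng = np.random.default_rng(1)
phi, n, h = 0.9, 500, 5

# Simulate a mean-zero AR(1) series: y_t = phi * y_{t-1} + e_t.
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.normal()

def slope(x, target):
    """OLS slope without intercept (the series is mean zero)."""
    return (x @ target) / (x @ x)

# Iterative: estimate the 1-step model, then apply it h times.
phi_hat = slope(y[:-1], y[1:])
iterative_forecast = phi_hat ** h * y[-1]

# Direct: regress y_{t+h} on y_t and forecast the horizon in one step.
beta_direct = slope(y[:-h], y[h:])
direct_forecast = beta_direct * y[-1]

print(iterative_forecast, direct_forecast)  # similar, but generally not identical
```

Both target phi^h, but they estimate it differently, which is exactly why the two methods give different results in finite samples.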

8
Q

what is realized volatility?

A

Realized volatility entails getting an on-the-spot measure of volatility by utilizing high-frequency returns, e.g. 5-minute intervals: compute the corresponding intraday variance and scale it to daily. This gives us a daily volatility estimate that represents the current level.
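A minimal sketch of the computation, assuming a standard 6.5-hour trading day split into 78 five-minute returns (the simulated returns and the true volatility level are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical intraday data: 78 five-minute returns in one trading day,
# drawn so the true daily volatility is 0.02.
n_intraday = 78
true_daily_vol = 0.02
r = rng.normal(0, true_daily_vol / np.sqrt(n_intraday), size=n_intraday)

# Realized variance: sum of squared intraday returns over the day.
rv = np.sum(r ** 2)

# Realized (daily) volatility is its square root -- one estimate per day.
realized_vol = np.sqrt(rv)
print(realized_vol)  # close to the true daily vol of 0.02
```

Summing squared 5-minute returns already aggregates to the daily horizon, so no further scaling is needed within the day; repeating this day by day gives a daily volatility series.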

9
Q

what is MIDAS

A

Mixed Data Sampling

10
Q

what does MIDAS do

A

it is a regression technique that allows for data sampled at different frequencies to be used in the same model.

Specifically, we sample the dependent variable at a lower frequency than the independent ones.

11
Q

the price at which electricity is sold is determined by the ?

A

merit order principle

In Europe, this happens in day-ahead markets.

12
Q

what is merit order?

A

A way of ordering available sources of energy.

These are ordered from lowest to highest marginal cost to determine which ones should produce first to supply a given demand at minimum cost for the system.

13
Q

give the merit order ordering

A

solar, wind, nuclear, hydro ("water"), combustion-based generation (e.g. biomass/gas), coal, oil

14
Q

why water not at 0?

A

Because stored water has an opportunity cost: water used now cannot be used later, and it is really bad if we exhaust the reservoir.

15
Q

does solar actors and offshore wind actors earn the same?

A

No, because they produce electricity at different times of the day. Wind is mostly at night, solar mostly during the day. Demand also differs at these times, which results in different clearing prices.

16
Q

elaborate on interpretation of the constant term

A

We need to be careful with this. It is very common that the constant intercepts the y-axis at a point where we basically have no data points. In these cases, it doesn't make much sense to interpret it as any base level.

17
Q

what can we do if we weant to test relationships that are more intricate?

A

We need to use testing under maximum likelihood.

This includes the 3 classic tests:
1) Wald
2) LR (likelihood ratio)
3) LM (Lagrange multiplier)

18
Q

what is the basis for which the ML testing works?

A

checking if the likelihood value drops when we enforce a restriction

19
Q

what can we do about heteroskedasticity?

A

If we happen to know the functional form of the heteroskedasticity, we can transform the model so the errors become homoskedastic (generalized least squares).

20
Q

say the heteroskedasticity depends on: var(u_t) = sigma^2 z_t^2

What can we do?

A

Divide the entire regression through by z_t. Since var(u_t / z_t) = var(u_t) / z_t^2 = sigma^2, this gives us error terms that are constant in variance.
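A minimal sketch of this transformation, assuming z_t is known and positive (the data-generating process and coefficients here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000

# Hypothetical setup: var(u_t) = sigma^2 * z_t^2 with known, positive z_t.
x = rng.normal(size=n)
z = np.exp(rng.normal(size=n) / 2)   # positive scale variable
u = z * rng.normal(size=n)           # heteroskedastic errors, sigma = 1
y = 1.0 + 2.0 * x + u

# GLS: divide every term of the regression (including the constant) by z_t,
# so the transformed errors u_t / z_t have constant variance sigma^2.
ones = np.ones(n)
X_star = np.column_stack([ones / z, x / z])
y_star = y / z

beta_gls = np.linalg.lstsq(X_star, y_star, rcond=None)[0]
print(beta_gls)  # approximately [1.0, 2.0]
```

Note that the constant term is divided by z_t as well, so it becomes a regressor 1/z_t in the transformed equation rather than a plain intercept.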

21
Q

elaborate on Durbin Watson

A

Tests a single lag for autocorrelation.

DW ≈ 2(1 - ρ̂)

Since correlation is between -1 and 1, DW is between 0 and 4.

Values close to 2 represent no autocorrelation.

There are also critical values that we need to look up, and these lie somewhere between 0 and 2 and between 2 and 4.
There are actually 2 sets of critical values: lower (d_L) and upper (d_U).
We reject the null of no autocorrelation when DW falls below the lower critical value (positive autocorrelation) or above 4 minus it (negative autocorrelation); between the lower and upper values the test is inconclusive.
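The statistic itself is easy to compute from residuals. A minimal sketch on simulated AR(1) residuals (the residual series and ρ = 0.5 are hypothetical, chosen to show DW ≈ 2(1 - ρ)):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000

# Hypothetical residuals with AR(1) autocorrelation rho = 0.5.
rho = 0.5
e = np.zeros(n)
for t in range(1, n):
    e[t] = rho * e[t - 1] + rng.normal()

def durbin_watson(resid):
    """DW = sum of squared first differences over the sum of squares."""
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

dw = durbin_watson(e)
print(dw)             # roughly 2 * (1 - 0.5) = 1.0
print(2 * (1 - rho))  # the approximation DW ~ 2(1 - rho)
```

A value near 1 like this would fall below typical lower critical values, pointing to positive autocorrelation.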

22
Q

what is the error-in-variables problem?

A

Measurement error in the explanatory variables. This is a serious problem and causes a violation of the exogeneity assumption, cov(x_t, u_t) = 0.

23
Q

what is 2SLS?

A

Two-stage least squares.

Refers to first running a regression to predict some endogenous variable using an instrumental variable. Then we replace the endogenous variable with its fitted values and run the actual regression of interest.

24
Q

elaborate on Heckman correction

A

The Heckman correction is a technique used to fix sample selection bias. It works best when the sample bias is systematic, so the selection process can be modeled.

25
Q

what are the stylized facts of financial returns we are working with

A

1) Nonstationary prices (random walk), stationarity of returns
2) Absence of autocorrelation of returns
3) Autocorrelation in squared returns
4) Volatility clustering: large returns tend to be followed by large returns, and vice versa
5) Fat-tailed distribution of returns, kurtosis larger than 3 (leptokurtosis)
6) Leverage effects: negative returns tend to increase volatility more than positive returns
26
Q

what specifically are the principal components?

A

The eigenvectors of the covariance matrix
27
Q

what do principal components represent?

A

The axes along which the data has the most variance
28
Q

how to rank the principal components?

A

We use their eigenvalues. Largest eigenvalue = largest/first principal component.
29
Q

how can we reduce our data but keep say 95% of the variance?

A

Choose the smallest k such that

(lambda_1 + ... + lambda_k) / (lambda_1 + ... + lambda_p) >= 0.95

Here, lambda_i is the eigenvalue of PC i, and we assume the components are ordered from largest to smallest eigenvalue.
30
Q

how do we find the covariance matrix that is used for the eigenvalue-vector problem?

A

Center the data first, then compute the sample covariance of the centered data.
31
Q

after computing the eigenvectors and values, then what?

A

We project the data onto the new axes that we have chosen. If we have a data vector x, we compute z_k = x^T v_k, where v_k is the k'th principal component vector. More typically, we have a data matrix X and multiply it by the PCA matrix that consists of the column-stacked eigenvectors. The number of columns of the data matrix always matches the number of rows of the eigenvector matrix, so the matrix product is well defined. The outcome Z is a new matrix containing the data projected onto the new axes.
32
Q

how to determine how much variance is captured by a component?

A

We use lambda_i, the eigenvalue, and divide it by the sum of all eigenvalues. This is the ratio of explained variance.
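The PCA steps in the cards above (center, covariance, eigendecomposition, ranking, explained-variance ratio, projection) can be sketched end to end. This is a minimal sketch on hypothetical data; the mixing matrix and the 0.95 threshold are my own choices:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical data: 500 observations of 3 correlated variables.
X = rng.normal(size=(500, 3)) @ np.array([[2.0, 0.0, 0.0],
                                          [0.5, 1.0, 0.0],
                                          [0.2, 0.1, 0.3]])

# 1) Center the data, then compute the sample covariance matrix.
Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / (len(Xc) - 1)

# 2) Eigendecomposition of the covariance matrix.
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]           # rank PCs by eigenvalue, largest first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 3) Explained-variance ratio: each eigenvalue over the sum of all eigenvalues.
ratio = eigvals / eigvals.sum()

# 4) Keep the smallest k with cumulative ratio >= 0.95, then project the data.
k = int(np.searchsorted(np.cumsum(ratio), 0.95) + 1)
Z = Xc @ eigvecs[:, :k]                     # data projected onto the first k PCs

print(ratio, k, Z.shape)
```

Note that `np.linalg.eigh` returns eigenvalues in ascending order, hence the explicit reordering before ranking the components.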
33