L3 - Estimation of Regression Parameters Flashcards

1
Q

How can the bivariate linear regression model be written?

A

Y{i} = α + βX{i} + u{i},  i = 1, …, N

  • X is the independent or explanatory variable
  • Y is the dependent or explained variable
  • u is a random error or disturbance
  • α and β are parameters which characterise the relationship between Y and X. The parameters are not observable directly.
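
A minimal simulation of this model may make the pieces concrete (the parameter values and sample size below are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(42)

# Population parameters: unobservable in practice, chosen here for illustration
alpha, beta = 2.0, 0.5
N = 100

X = rng.uniform(0, 10, N)   # independent / explanatory variable
u = rng.normal(0, 1, N)     # random error or disturbance
Y = alpha + beta * X + u    # dependent / explained variable
```
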
2
Q

Why is regression analysis useful?

A
  • Regression analysis is the most important tool which economists use to quantify their models.
  • Economic theory provides explanations of linkages between variables of interest e.g. the relationship between consumption expenditures and disposable income.
  • However, theory rarely gives precise values for the size of the response of one variable to another. For this we must turn to econometrics and, in particular, to regression analysis.
  • The regression model provides a mechanism by which the response of one variable to another can be quantified and evaluated from a statistical perspective.
  • It therefore acts as one of the key items in the toolkit of the applied social scientist and the objective of this chapter is to discuss how it can be used sensibly in the investigation of economic relationships.
3
Q

What are two interpretations of the regression model?

A

1 - The X values are chosen by the investigator, e.g. by a process of experimentation.

  • In this case the X variable is not random and can be treated as being ‘fixed in repeated samples’.

2 - The X and Y variables are jointly distributed random variables with cov(X,Y) ≠ 0 (covariance).

  • This is more realistic for economic data but harder to deal with when deriving the distribution of estimators.
4
Q

Who first solved the problem of estimating the linear regression parameters?

A
  • Mayer’s (1750) solution: form linear combinations of the equations to reduce the number of equations to the number of unknown coefficients.
  • He would write out, for each observed pair of X and Y values, the corresponding algebraic equation in the unknowns α(hat) and β(hat), the estimates of the regression line.
  • He would then average groups of these equations to reduce their number to the number of coefficients and solve the resulting system (in this example, simultaneously).
  • These estimates are unbiased estimates of the population parameters.
  • However, there are an infinite number of linear combinations which are consistent with this procedure; one is sketched below.
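
A sketch of one such linear combination: split the sample into two halves ordered by X and average the equations within each half. This specific grouping is an illustrative assumption; any other grouping gives an equally valid linear combination.

```python
import numpy as np

def mayer_estimates(X, Y):
    """Average the equations Y_i = a + b*X_i within two halves of the
    sample, then solve the resulting 2x2 system for (a, b)."""
    order = np.argsort(X)
    lo, hi = order[: len(X) // 2], order[len(X) // 2:]
    # Each group's averaged equation: mean(Y_g) = a + b * mean(X_g)
    A = np.array([[1.0, X[lo].mean()],
                  [1.0, X[hi].mean()]])
    c = np.array([Y[lo].mean(), Y[hi].mean()])
    a_hat, b_hat = np.linalg.solve(A, c)
    return a_hat, b_hat
```
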
5
Q

What is the Method of Least Squares?

A
  • An estimator is a rule for calculating an estimate of an unknown value using observable data. Mayer’s method gives us a possible estimator but this is not unique.
  • An alternative method is to choose estimates of the parameters which minimise the residual sum of squares:

min{α(hat),β(hat)} RSS = Σ_i=1^N (Y{i} - α(hat) - β(hat)X{i})^2

  • This is the least-squares estimator or, as it is sometimes referred to, the ordinary least squares (OLS) estimator.
  • OLS provides a simple method for the generation of such estimates which, under certain assumptions, can be shown to have the desirable properties that the estimates are both unbiased and efficient (in the sense that they have the lowest possible variances in the class of unbiased estimators).
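
The definition can be applied directly with a general-purpose numerical minimiser. A sketch (the closed-form solution on a later card makes this unnecessary in practice, but it makes the definition of the estimator concrete):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, 100)
Y = 2.0 + 0.5 * X + rng.normal(0, 1, 100)   # illustrative data

def rss(params, X, Y):
    a_hat, b_hat = params
    return np.sum((Y - a_hat - b_hat * X) ** 2)

# Choose (alpha_hat, beta_hat) to minimise the residual sum of squares
result = minimize(rss, x0=[0.0, 0.0], args=(X, Y))
alpha_hat, beta_hat = result.x
```
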
6
Q

Who introduced the Method of Least Squares?

A

This method was first introduced by Legendre in 1805. It improves on Mayer’s method because the variance of the parameter estimates is the lowest possible.

7
Q

What are the least-squares normal equations?

A
  • Minimising the residual sum of squares yields the following pair of equations, known as the least-squares normal equations:

α(hat)N + β(hat)Σ_i=1^N X{i} = Σ_i=1^N Y{i}

α(hat)Σ_i=1^N X{i} + β(hat)Σ_i=1^N X{i}^2 = Σ_i=1^N X{i}Y{i}

Solving these equations yields the least-squares estimates:

α(hat) = Y(bar) - β(hat)X(bar)

Substituting this into the second equation above gives:

β(hat) = (Σ_i=1^N (X{i}-X(bar))(Y{i}-Y(bar))) / (Σ_i=1^N (X{i}-X(bar))^2)

OR

β(hat) = cov(X,Y)/var(X)

8
Q

How can the OLS estimates be calculated?

A
  1. Calculate the slope coefficient as the ratio of the sample covariance of X and Y to the sample variance of X –> solve for β(hat)
  2. Calculate the intercept using the property that the regression line passes through the sample means of the data (X(bar) and Y(bar)) –> solve for α(hat), as in the sketch below
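
A sketch of the two steps, assuming X and Y are NumPy arrays:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, 100)
Y = 2.0 + 0.5 * X + rng.normal(0, 1, 100)   # illustrative data

# Step 1: slope = sample covariance of X and Y / sample variance of X
beta_hat = np.cov(X, Y, ddof=1)[0, 1] / np.var(X, ddof=1)

# Step 2: the fitted line passes through the point (X bar, Y bar)
alpha_hat = Y.mean() - beta_hat * X.mean()
```
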
9
Q

What is the difference between α/β and α(hat)/β(hat)?

A
  • α and β are population parameters.
  • α(hat) and β(hat) are estimators of the population parameters based on sample data.
  • The estimators are random variables because they are constructed from the random variables Y and (possibly) X.
  • The population parameters are not random variables. They are unknown/unobservable parameters which we must estimate using the data available.
10
Q

What do the different parts of the OLS estimators mean?

A
  • Mean of X{i}: X(bar) = Σ_i=1^N X{i} / N
  • Mean of Y{i}: Y(bar) = Σ_i=1^N Y{i} / N
  • Deviations of X from its mean: (X{i} - X(bar)) ∀i
  • Deviations of Y from its mean: (Y{i} - Y(bar)) ∀i
  • Squared deviations of X from its mean: (X{i} - X(bar))^2 ∀i
  • Squared deviations of Y from its mean: (Y{i} - Y(bar))^2 ∀i
11
Q

What is Maximum Likelihood?

A
  • The method of maximum likelihood is an alternative way to generate estimates of the unknown parameters. It begins by making an assumption about the distribution of the errors.

Y{i}=α + βX{i} + u{i}

u{i} ~ N(0, σ{u}^2),  E(u{i}u{j}) = 0 ∀ i ≠ j

  • The errors are assumed to be independent, identically distributed (iid) normal random variables. If the data collected are iid then the data are said to be a random sample.
  • In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of a statistical model given observations, by finding the parameter values that maximise the likelihood of making the observations given the parameters.
  • e.g. if we had a set of data which is normally distributed, what values of μ and σ^2 are most likely responsible for creating the data points that we observed?
12
Q

What is the PDF for the errors in the Maximum Likelihood model?

A

f(u{i}) = (1/sqrt(2πσ{u}^2)) * exp(-(Y{i} - α - βX{i})^2 / (2σ{u}^2))
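
This is just the normal density evaluated at u{i} = Y{i} - α - βX{i}. A quick numerical cross-check against scipy's implementation (σ{u} = 1 and the simulated errors are arbitrary assumptions here):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)
u = rng.normal(0, 1, 100)   # stand-in for Y_i - alpha - beta*X_i
sigma_u = 1.0               # assumed error standard deviation

manual = np.exp(-u**2 / (2 * sigma_u**2)) / np.sqrt(2 * np.pi * sigma_u**2)
assert np.allclose(manual, norm.pdf(u, loc=0, scale=sigma_u))
```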

13
Q

What is the likelihood function?

A

L(α,β,σ{u}^2) = Π_i=1^N (1/sqrt(2πσ{u}^2)) * exp(-(Y{i} - α - βX{i})^2 / (2σ{u}^2))

  • This is the joint probability (density) of the errors, the product of the individual PDFs.
  • Taking logarithms of this gives us the log-likelihood function.

14
Q

What is the log-likelihood function?

A

LL(α,β,σ{u}^2) = -(N/2)ln(2π) - (N/2)ln(σ{u}^2) - Σ_i=1^N (Y{i} - α - βX{i})^2 / (2σ{u}^2)
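
A sketch that maximises this function numerically (by minimising its negative) and recovers the same α(hat) and β(hat) as least squares; optimising over log σ{u}^2 keeps the variance positive:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, 100)
Y = 2.0 + 0.5 * X + rng.normal(0, 1, 100)   # illustrative data

def neg_log_likelihood(params, X, Y):
    a, b, log_s2 = params
    s2 = np.exp(log_s2)
    resid = Y - a - b * X
    N = len(Y)
    return -(-0.5 * N * np.log(2 * np.pi)
             - 0.5 * N * np.log(s2)
             - np.sum(resid**2) / (2 * s2))

res = minimize(neg_log_likelihood, x0=[0.0, 0.0, 0.0], args=(X, Y))
a_ml, b_ml, sigma2_ml = res.x[0], res.x[1], np.exp(res.x[2])
# a_ml and b_ml coincide with the OLS estimates; sigma2_ml uses the 1/N formula
```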

15
Q

What does the method of maximum likelihood involve?

A

The method of maximum likelihood involves choosing estimates of the population parameters which maximise the log-likelihood function. The first-order conditions for a maximum are:

  • dLL/dα = (1/σ{u}^2) Σ_i=1^N (Y{i} - α - βX{i}) = 0
  • dLL/dβ = (1/σ{u}^2) Σ_i=1^N X{i}(Y{i} - α - βX{i}) = 0
  • dLL/dσ{u}^2 = -N/(2σ{u}^2) + (1/(2(σ{u}^2)^2)) Σ_i=1^N (Y{i} - α - βX{i})^2 = 0
16
Q

For the method of maximum likelihood what do the first two first order conditions yield?

A
  • These are identical to the least-squares normal equations. Therefore, for normally distributed errors, least squares and maximum likelihood give identical parameter estimates.

Σ_i=1^N Y{i} = α{ML}(hat)N + β{ML}(hat)Σ_i=1^N X{i}

Σ_i=1^N X{i}Y{i} = α{ML}(hat)Σ_i=1^N X{i} + β{ML}(hat)Σ_i=1^N X{i}^2

17
Q

For the method of maximum likelihood what does the third condition yield?

A

This is different from the formula normally used for the variance of a least squares regression because it does not adjust for the loss of degrees of freedom when estimating the other regression parameters.

  • The maximum likelihood estimator of the error variance will be biased in small samples. However, the bias will tend to zero as the sample size becomes large.

σ{u}^2(hat){ML} = (1/N) Σ_i=1^N (Y{i} - α{ML}(hat) - β{ML}(hat)X{i})^2
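
The two estimators of the error variance differ only in the divisor. A sketch comparing them, with the residuals taken from the closed-form OLS fit:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, 100)
Y = 2.0 + 0.5 * X + rng.normal(0, 1, 100)   # illustrative data

beta_hat = np.cov(X, Y, ddof=1)[0, 1] / np.var(X, ddof=1)
alpha_hat = Y.mean() - beta_hat * X.mean()
u_hat = Y - alpha_hat - beta_hat * X        # regression residuals

sigma2_ml  = np.sum(u_hat**2) / len(Y)        # ML: biased in small samples
sigma2_ols = np.sum(u_hat**2) / (len(Y) - 2)  # adjusts for 2 estimated parameters
# N/(N-2) -> 1 as N grows, so the bias vanishes in large samples
```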

18
Q

What are degrees of freedom?

A

  • The degrees of freedom of an estimate is the number of independent pieces of information that went into calculating the estimate.
  • It is not quite the same as the number of items in the sample –> d.f. = n - 1 for one sample under the t-distribution.
  • Another way to look at degrees of freedom is that they are the number of values that are free to vary in a data set.

19
Q

When do you use OLS or ML?

A
  • When the number of observations is large enough you can use either, since they give the same parameter estimates.
  • When the number of observations is small, use OLS, as the ML estimator of the error variance is biased in small samples.
20
Q

How are the regression residuals defined in the OLS and ML methods?

A

The regression residuals are defined as the difference between the actual values and the fitted values from the regression model:

  • u{i}(hat) = Y{i} - α(hat) - β(hat)X{i}
  • These will be the same for both the OLS and ML estimates.
  • Note that these are not the same as the equation errors, which depend on the unknown population parameters rather than the regression parameter estimates.
  • Note also: Σ_i=1^N u{i}(hat) = 0 and Σ_i=1^N X{i}u{i}(hat) = 0. These are true by construction, as verified in the sketch below.
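
Both properties can be verified numerically:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, 100)
Y = 2.0 + 0.5 * X + rng.normal(0, 1, 100)   # illustrative data

beta_hat = np.cov(X, Y, ddof=1)[0, 1] / np.var(X, ddof=1)
alpha_hat = Y.mean() - beta_hat * X.mean()
u_hat = Y - alpha_hat - beta_hat * X

# Hold by construction of the least-squares estimates (up to rounding error)
assert np.isclose(np.sum(u_hat), 0.0, atol=1e-8)
assert np.isclose(np.sum(X * u_hat), 0.0, atol=1e-6)
```
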
21
Q

What is the difference between regression residuals and errors?

A
  • error –> the difference between the observed data and the population regression line, using the actual values of α and β
  • residual –> the difference between the observed data and the sample regression line, using the parameter estimates α(hat) and β(hat)
  • the residuals can be regarded as estimates of the errors
22
Q

What is standard error?

A
  • sometimes referred to as the standard error of the mean, it is the variability of the mean across different samples taken from a single population
  • this is the most basic version, but in general, if you compute a statistic (e.g. the median) for many samples and take the standard deviation of those values, you have found the standard error of that statistic
  • in short, the standard error is the standard deviation of a sample statistic across repeated samples from one population
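
A small simulation of the idea for the sample mean (the population parameters and number of samples are arbitrary choices); the empirical standard deviation of the sample means approaches σ/sqrt(n):

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw many samples of size n from one population; record each sample mean
n, n_samples = 50, 10_000
samples = rng.normal(loc=5.0, scale=2.0, size=(n_samples, n))
means = samples.mean(axis=1)

empirical_se = means.std(ddof=1)     # standard deviation of the sample means
theoretical_se = 2.0 / np.sqrt(n)    # sigma / sqrt(n) for the sample mean
```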