Unit 3 : Simple Linear Regression Model Flashcards Preview

Flashcards in Unit 3 : Simple Linear Regression Model Deck (14)

What is the difference between estimators and parameters?

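A parameter is a fixed, usually unknown number that describes the population (for example, the population slope B1).
An estimator is a rule, computed from sample data, for approximating a parameter (for example, the OLS slope). The number it produces for a particular sample is called an estimate.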


What is a regression residual?

This is the difference between an observed value of Y and the value predicted by the estimated regression line: residual = Yi - (predicted Yi).

The regression line gives the predicted value of Y for each X, so the residuals capture the variation in the dependent variable that the model does not explain. For example, test scores could also be affected by family background, quality of teachers, etc. If such factors are left out of the regression model, they sit in the error term (u) and show up in the residuals. The residuals are the sample counterpart of the unobserved error term u.
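As a small illustration of the definition above, the sketch below computes residuals for a made-up data set. The data and the fitted intercept and slope (`b0`, `b1`) are hypothetical values chosen for the example, not taken from the course material.

```python
# Hypothetical data and a hypothetical fitted line (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1 = 2.2, 0.6  # assumed intercept and slope of the fitted line

# Predicted value for each observation: what the line says Y "should" be.
y_hat = [b0 + b1 * xi for xi in x]

# Residual: actual value minus predicted value.
residuals = [yi - yhi for yi, yhi in zip(y, y_hat)]

for xi, yi, yhi, ei in zip(x, y, y_hat, residuals):
    print(f"x={xi:.0f}  actual={yi:.1f}  predicted={yhi:.1f}  residual={ei:+.1f}")
```

A positive residual means the point lies above the line, a negative one means it lies below.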


What can we use to estimate the unknown parameters in a regression model?

We can use the OLS (Ordinary Least Squares) estimators to estimate the unknown intercept and slope (B0 and B1).

This method chooses the intercept and slope that minimise the sum of squared residuals, i.e. the sum of the squared differences between each actual Yi and its predicted value on the regression line.
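For the simple (one-regressor) model, the minimisation has a well-known closed-form solution: the slope is the sample covariance of X and Y divided by the sample variance of X, and the intercept makes the line pass through the sample means. A minimal sketch, where the function name `ols_fit` and the data are hypothetical:

```python
def ols_fit(x, y):
    """Return (b0, b1), the intercept and slope minimising the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-deviations and squared deviations from the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x must take at least two different values")
    b1 = s_xy / s_xx          # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1


# Hypothetical data (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1 = ols_fit(x, y)
print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
```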


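What are some implications that we can derive from the FOCs (first-order conditions) of OLS estimators?

- The residuals sum to zero, so their sample mean is zero.
- The sample covariance between X and the residuals is zero (no correlation between them in the sample).
- The regression line passes through the point of sample means (mean of X, mean of Y), so the mean of the predicted values equals the mean of the actual values of Y.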
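These properties hold by construction in any OLS fit with an intercept, which can be checked numerically. The sketch below uses hypothetical data and the standard closed-form OLS formulas for one regressor. All variable names are illustrative.

```python
# Hypothetical data (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form OLS estimates for one regressor.
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * xi for xi in x]
residuals = [yi - yhi for yi, yhi in zip(y, y_hat)]

# Implications of the first-order conditions (zero up to rounding error).
sum_resid = sum(residuals)
cov_x_resid = sum((xi - x_bar) * ei for xi, ei in zip(x, residuals)) / (n - 1)
mean_fitted = sum(y_hat) / n
line_at_x_bar = b0 + b1 * x_bar

print(f"sum of residuals          = {sum_resid:.2e}")
print(f"cov(x, residuals)         = {cov_x_resid:.2e}")
print(f"mean fitted - mean actual = {mean_fitted - y_bar:.2e}")
print(f"line at mean x - mean y   = {line_at_x_bar - y_bar:.2e}")
```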


What are two regression statistics that we can use to determine how well the data fits the regression line?

R squared and SER (standard error of the regression).

R^2 is the fraction of the sample variance of the dependent variable that is explained by the independent variable (explained sum of squares divided by total sum of squares). It lies between 0 and 1, with 1 indicating a perfect fit and 0 indicating that X explains none of the variation in Y.

SER is an estimate of the standard deviation of the error term, computed from the residuals. It measures the typical spread of the actual data points around the regression line, in the units of Y.
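Both statistics can be computed directly from the residuals. The sketch below assumes the common textbook convention of dividing the sum of squared residuals by n - 2 when computing SER in the one-regressor model. Some sources divide by n instead. The function name `fit_statistics` and the data are hypothetical.

```python
import math


def fit_statistics(x, y):
    """Return (r_squared, ser) for a simple OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    b0 = y_bar - b1 * x_bar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

    ssr = sum(e ** 2 for e in residuals)        # sum of squared residuals
    tss = sum((yi - y_bar) ** 2 for yi in y)    # total sum of squares
    r_squared = 1 - ssr / tss                   # share of variation explained
    ser = math.sqrt(ssr / (n - 2))              # assumed n - 2 degrees of freedom
    return r_squared, ser


# Hypothetical data (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
r_squared, ser = fit_statistics(x, y)
print(f"R^2 = {r_squared:.3f}, SER = {ser:.3f}")
```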


What are the five assumptions of simple linear regression?

- Linearity
- Random sampling
- Sample variation in the independent variable
- Zero conditional mean of the error term
- Homoskedasticity and no autocorrelation
- (Normality of the error term, an optional sixth assumption)


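Explain the linearity assumption in simple linear regression.

The model must be linear in the parameters, with the error term entering additively: Y = B0 + B1*X + u. The variable X itself may be transformed (e.g. log X or X^2), but the parameters may not enter non-linearly. For example, if B1 appeared as an exponent (Y = B0 + X^B1 + u), the model would no longer be linear in the parameters.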


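Explain the random sampling assumption in simple linear regression.

The sample (Xi, Yi), i = 1, ..., n, is drawn at random from the population, so the observations are independent of each other and come from the same distribution. Some treatments additionally take the X values as known and non-random. Together with the other assumptions, random sampling leads to unbiased estimates.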


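Explain the sample variation in X assumption in simple linear regression.

The independent variable has to take on at least 2 different values in the sample. If every Xi were the same, the sample variance of X would be zero and the OLS slope could not be computed.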


Explain the zero conditional mean of the error term assumption in simple linear regression.

The expected value of the error term is zero for every value of the independent variable: E(u | X) = 0. This implies that the error term has mean zero and is uncorrelated with the independent variable.

For each value of X, the values of Y are centred on the population regression line: E(Y | X) = B0 + B1*X.


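Explain the homoskedasticity and no autocorrelation assumption in simple linear regression.

Homoskedasticity: the conditional variance of the error term is the same regardless of the X value, Var(u | X) = sigma^2. No autocorrelation: the error terms of different observations are not correlated with each other.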


What is an outlier? How should we deal with it?

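This is a data point that lies far away from the regression line and from all other points. We should first check whether it is a data error. If it is, we correct it or exclude it from our data and perform the regression again. If it is a genuine observation, it is safer to report the results with and without it than to drop it silently.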


When can we say that an estimator is unbiased?
When can we say that an OLS estimator is unbiased?

- An estimator in general: when its expected value equals the population parameter it estimates.

- An OLS estimator: when assumptions 1-4 hold true.
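Unbiasedness is a statement about the average of the estimator over many samples, not about any single estimate. The sketch below is a small Monte Carlo simulation under assumed conditions: a true line Y = 1 + 2X + u, X drawn uniformly from [0, 10], and errors drawn independently of X from a standard normal. All of these choices, and the function names, are hypothetical and only serve the illustration.

```python
import random


def ols_slope(x, y):
    """Closed-form OLS slope for one regressor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def simulate_slopes(beta0, beta1, n, reps, seed=0):
    """Draw `reps` random samples of size `n` and return the OLS slope from each."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    slopes = []
    for _ in range(reps):
        x = [rng.uniform(0, 10) for _ in range(n)]
        # Error drawn independently of x, so E(u | X) = 0 holds by construction.
        y = [beta0 + beta1 * xi + rng.gauss(0, 1) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes


slopes = simulate_slopes(beta0=1.0, beta1=2.0, n=50, reps=2000)
mean_slope = sum(slopes) / len(slopes)
print(f"true slope = 2.0, average of {len(slopes)} estimates = {mean_slope:.3f}")
```

Individual estimates scatter around 2.0, while their average across samples should land very close to it.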


What can we say about large sample sizes and size of OLS estimator variance?

The larger the sample size and the larger the sample variance of X, the lower the variance of the OLS estimators. A larger variance of the error term has the opposite effect.
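The standard textbook expression for the variance of the OLS slope under assumptions 1-5 makes this explicit. The notation here is an assumption of this note rather than the deck's own: \hat{\beta}_1 is the OLS slope estimator, \sigma^2 the error variance, and s_X^2 the sample variance of X.

```latex
\operatorname{Var}\!\left(\hat{\beta}_1 \mid X\right)
  = \frac{\sigma^2}{\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2}
  = \frac{\sigma^2}{(n-1)\, s_X^2}
```

The denominator grows with both n and s_X^2, so either one lowers the variance, while a larger \sigma^2 raises it.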

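All 5 assumptions must hold for the OLS estimators to be BLUE (Best Linear Unbiased Estimators). Assumption 5 is also needed for the usual OLS variance and standard error formulas, which are used together with the normal distribution for inference.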