Regression Flashcards
(45 cards)
define regression analysis
regression analysis is the process of describing and evaluating the relationship between a given variable and one or more other variables.
Specifically, we single out one variable and try to understand how it moves as a result of movements in a set of other variables.
elaborate on correlation
correlation is a measure of the tendency of two variables to move together. It does not imply causality. Often there is a third factor that affects both variables, which makes it look like movement in one of them causes the other to move as well.
elaborate on how we treat the variables in regression analysis
the dependent variable is a random variable subject to a probability distribution.
The independent variables are assumed to have “fixed values in repeated samples”.
elaborate on the fixed regressor assumption
Same as “fixed in repeated samples”.
It is primarily a pedagogical simplification.
It assumes that the values of the independent variables are the same across repeated samples. Therefore, there is no uncertainty related to the sampling process.
This means that the only uncertainty comes from the error term, which captures everything we are unable to see or measure.
elaborate on the random disturbance term
The idea is that it is essentially impossible to represent the relationship with an exact straight line: there are very likely variables we have not accounted for, measurement error, and so on.
Since we are trying to model a relationship that is not perfectly linear, we cannot use a perfect line to do it.
Therefore, we add a random disturbance term, u_t, which is specific to each observation. This allows us to keep the exact line a + bx_t while accounting for the differences between the line and the individual points:
y_t = a + bx_t + u_t
how do we determine alpha and beta?
minimizing the vertical distances between each sample point and the exact line.
why minimize vertical distances? Why not horizontal? Why not perpendicular?
we are assuming fixed regressors. Therefore, the task becomes minimizing the vertical distances.
perpendicular comes into play later, when we include error in the sampling process.
elaborate on OLS
minimizing the sum of squared errors. Squaring the errors penalizes outliers harder.
y_t is the observed value.
ŷ_t ("y hat") is the prediction from the line.
The residual, û_t, represents the error: û_t = y_t - ŷ_t
what is RSS
Residual Sum of Squares
∑(y_t - ŷ_t)^2
elaborate on how we derive the OLS estimators
differentiate the expression with respect to the estimators and set the derivatives to zero:
RSS = ∑(y_t - ŷ_t)^2 = ∑(y_t - â - b̂x_t)^2
Recall why this works: the loss function is convex, so the first-order conditions give the global minimum.
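Solving the first-order conditions gives the familiar closed-form estimators b̂ = ∑(x_t - x̄)(y_t - ȳ) / ∑(x_t - x̄)^2 and â = ȳ - b̂x̄. A minimal sketch in Python (numpy; the data points are made up for illustration):

```python
import numpy as np

# Illustrative sample; x is treated as fixed, y is observed
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Closed-form OLS estimators obtained by minimizing RSS
b_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a_hat = y.mean() - b_hat * x.mean()

# Fitted values and the residual sum of squares
y_hat = a_hat + b_hat * x
rss = np.sum((y - y_hat) ** 2)
print(a_hat, b_hat, rss)
```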
elaborate on the reliability of the intercept
In many cases, all the observed data points are far away from the intercept. This means that we have no data close to the intercept, so we cannot trust predictions made there.
This generalizes to any region along the fitted line where data are missing. We should be aware of the range of our data points, thereby defining a sort of "operating range" of values within which we can be fairly confident.
elaborate on the PRF
PRF = Population Regression Function
It is the model that we "consider" to be the true data generating process.
y_t = a + bx_t + u_t
the PRF represents the true relationship between the independent variables and the dependent variable.
elaborate on the SRF
Sample regression function.
It is the estimated population regression function.
the SRF has no error term.
ŷ_t = â + b̂x_t
elaborate on linearity we require
linearity in parameters (not necessarily variables).
This is the requirement to use OLS.
estimator vs estimate
an estimator is a function (a formula applied to the sample).
An estimate is the output of an estimator for a particular sample.
elaborate on CLRM
Classical Linear Regression Model.
CLRM is the classical linear model y_t = a + bx_t + u_t.
y_t depends on u_t; therefore, we must specify some assumptions on how this random disturbance term is generated:
1) E(u_t) = 0 (zero mean)
2) var(u_t) = sigma^2 < infinity (constant, finite variance)
3) cov(u_i, u_j) = 0 for i ≠ j (errors independent of each other)
4) cov(u_t, x_t) = 0 (errors independent of the regressors)
5) u_t ~ N(0, sigma^2) (normally distributed)
what happens if assumptions 1-4 hold?
BLUE
the estimators will have desirable properties:
1) Best: lowest variance among the class of linear unbiased estimators
2) Linear
3) Unbiased
4) Estimators of the true value
how do we know that the estimators have the lowest variance in class?
Gauss-Markov theorem
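Unbiasedness can be illustrated with a small Monte Carlo experiment: keep the regressors fixed (matching the fixed-regressor assumption), redraw only the disturbances, and average the slope estimates, which settle near the true value. The parameter values (a = 1, b = 2, sigma = 1) are arbitrary choices for this sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, sigma = 1.0, 2.0, 1.0
x = np.linspace(0, 10, 50)  # fixed in repeated samples

estimates = []
for _ in range(2000):
    u = rng.normal(0, sigma, size=x.size)  # assumptions 1, 2, 5
    y = a + b * x + u
    b_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    estimates.append(b_hat)

print(np.mean(estimates))  # close to the true b = 2
```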
elaborate on consistency
consistency refers to an estimator approaching the true value as the number of samples grows large, but not necessarily for small sample sizes.
what property do we say that consistency is?
An asymptotic property
are all unbiased estimators consistent?
Not necessarily. If the variance does not shrink toward zero as the sample size increases, the estimator will not be consistent: the probability of observing large differences between the estimated value and the true value remains relatively large.
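A quick sketch of consistency: the spread of the OLS slope estimator shrinks as the sample size grows. The parameter values and data are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = 1.0, 2.0

# The spread of the OLS slope estimate shrinks as the sample grows
for n in (10, 100, 1000):
    b_hats = []
    for _ in range(500):
        x = rng.uniform(0, 10, n)
        u = rng.normal(0, 1, n)
        y = a + b * x + u
        b_hats.append(np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2))
    print(n, np.std(b_hats))  # standard deviation falls toward zero
```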
elaborate a little on standard errors
Standard error is a precision measure used to indicate how reliable an estimate is.
It is found by taking the standard deviation of a statistic.
Note that the standard error does not tell us anything about the goodness of a specific estimate. It only tells us what we can expect from an arbitrary estimate: a very low standard error means that estimates are, in general, very precise.
The standard error is a function of x, the sample variance, and the sample size.
give the matrix form multivariable linear regression
y = Xb + u
y and u are vectors of size Tx1.
X is a matrix of size TxK
b is a vector of size Kx1
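In matrix form, the OLS estimator is b̂ = (X'X)^{-1} X'y. A minimal sketch with T = 100 observations and K = 3 (an intercept column plus two made-up regressors):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 100

# Design matrix X (T x K): intercept column plus two regressors
X = np.column_stack([np.ones(T), rng.normal(size=T), rng.normal(size=T)])
b_true = np.array([1.0, 2.0, -0.5])
y = X @ b_true + rng.normal(0, 1, T)

# OLS: b_hat = (X'X)^{-1} X'y  (solving the normal equations is
# numerically preferable to forming an explicit inverse)
b_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(b_hat)
```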
for the general linear regression model, how do we represent the standard errors of the parameters?
we use the square roots of the diagonal elements of the estimated variance-covariance matrix:
var(b̂) = s^2 (X'X)^{-1}, where s^2 = û'û / (T - K)
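A minimal sketch computing standard errors as the square roots of the diagonal of s^2 (X'X)^{-1}, with s^2 = û'û / (T - K); the data are simulated for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
T, K = 200, 3

# Simulated design matrix and response (illustrative parameter values)
X = np.column_stack([np.ones(T), rng.normal(size=T), rng.normal(size=T)])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(0, 1, T)

# OLS estimates and residuals
b_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ b_hat

# s^2 = u'u / (T - K); var-cov matrix is s^2 (X'X)^{-1}
s2 = u_hat @ u_hat / (T - K)
var_cov = s2 * np.linalg.inv(X.T @ X)
std_errs = np.sqrt(np.diag(var_cov))
print(std_errs)
```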