Descriptive Analysis and Linear Regression Flashcards
Linear Regression Model
Yi = B1 + B2X2i + ... + BkXki + ui
Yi = dependent variable
Xi = explanatory/independent variable/regressor
B1 = intercept/constant (average value of Y when all Xs = 0)
B2 = slope coefficient
ui
stochastic error term
average effect of all unobserved variables
objective of regression analysis
estimate values of Bs based on sample data
OLS
Ordinary Least Squares - used to estimate regression coefficients
finds the pair of B1 and B2 (b1 and b2) that minimise RSS
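The RSS-minimising pair has a closed-form solution in the simple two-variable case. A minimal sketch with made-up data (the `ols` helper and the numbers are illustrative, not from the cards):

```python
import numpy as np

# Closed-form OLS for Yi = B1 + B2*Xi + ui:
# b2 = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2), b1 = ybar - b2*xbar.
# These are the values that minimise the residual sum of squares (RSS).
def ols(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b1 = y.mean() - b2 * x.mean()
    return b1, b2

x = [1, 2, 3, 4, 5]            # made-up sample data
y = [2.1, 3.9, 6.2, 8.1, 9.8]
b1, b2 = ols(x, y)             # b1 ≈ 0.14, b2 ≈ 1.96
```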
OLS assumptions
- LRM is linear in its parameters
- regressors = fixed/non-stochastic
- exogeneity - expected value of error term = 0 given values of X
- homoscedasticity - constant variance of each u given values of X
- no multicollinearity - no exact linear relationship between the regressors
- u follows normal distribution
OLS estimators are BLUE
best linear unbiased estimators
- estimators are linear functions of Y
- on average they are equal to the true parameter values
- they have minimum variance i.e. efficient
standard deviation of error term =
standard error of the regression
= sqrt(RSS/df)
degrees of freedom
df = n - k
n = sample size
k = no. of parameters estimated (the Bs)
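A quick sketch of this formula, using made-up residuals (k = 2 here, i.e. an intercept plus one slope):

```python
import numpy as np

# Standard error of the regression = sqrt(RSS / (n - k)),
# where RSS is the sum of squared residuals and n - k the degrees of freedom.
residuals = np.array([0.2, -0.3, 0.1, -0.1, 0.1])  # illustrative residuals
n, k = len(residuals), 2
rss = np.sum(residuals ** 2)          # RSS = 0.16
se_regression = np.sqrt(rss / (n - k))
```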
hypothesis testing
construct H0 and Ha, e.g. H0: B2 = 0 and Ha: B2 ≠ 0
t = b2/se(b2)
if |t| > critical value from t table
reject null
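The decision rule can be sketched in a few lines; the estimate, standard error, and critical value below are made-up illustrative numbers (the 2.048 is the 5% two-sided value for 28 df, read from a t table):

```python
# t-test of H0: B2 = 0 against Ha: B2 != 0, with hypothetical numbers.
b2, se_b2 = 1.96, 0.5       # hypothetical OLS estimate and its standard error
t = b2 / se_b2              # t-statistic = 3.92
cv = 2.048                  # 5% two-sided critical value, 28 df (from a t table)
reject_null = abs(t) > cv   # True: the coefficient is significant at 5%
```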
type 1 error
incorrect rejection of true null
detecting an effect that is not present
type 2 error
failure to reject false null
failing to detect present effect
low p-value
suggests that the estimated coefficient is statistically significant
p-value < 0.01, 0.05, 0.1
statistically significant at 1%, 5%, 10% levels
dummy variables
0 = absence 1 = presence
e.g. 1 if female, 0 if male
B2 measures the change in average Y when you go from male to female
b1 = estimated wage for men
b2 = estimated diff btw men and women
b1+b2 = estimated wage for women
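This interpretation can be checked numerically: with a single female dummy, OLS reproduces the group means. A sketch with made-up wages:

```python
import numpy as np

# Wage regression on a constant and a female dummy (illustrative data).
wage = np.array([10.0, 12.0, 11.0, 9.0, 10.0, 11.0])   # 3 men, then 3 women
female = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])      # 0 = male, 1 = female

X = np.column_stack([np.ones(6), female])              # intercept + dummy
b1, b2 = np.linalg.lstsq(X, wage, rcond=None)[0]
# b1 = men's mean wage (11), b2 = women-minus-men difference (-1),
# b1 + b2 = women's mean wage (10)
```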
if exogeneity assumption doesn’t hold
leads to biased estimates, so we need to adjust for omitted variables