Descriptive Analysis and Linear Regression Flashcards

1
Q

Linear Regression Model

A
Yi = B1 + B2X2i + … + BkXki + ui
Yi = dependent variable
Xi = explanatory/independent variable (regressor)
B1 = intercept/constant (average value of Y when X = 0)
B2 = slope coefficient
2
Q

ui

A

stochastic error term

average effect of all unobserved variables

3
Q

objective of regression analysis

A

estimate values of Bs based on sample data

4
Q

OLS

A

Ordinary Least Squares - used to estimate the regression coefficients
finds the estimates b1 and b2 of B1 and B2 that minimise the RSS
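A minimal sketch of this (assuming Python with numpy and statsmodels, since the deck names no software; data simulated):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2.0 + 0.5 * x + rng.normal(0, 1, 100)   # true B1 = 2, B2 = 0.5

# Closed-form OLS for the two-variable model: b2 = cov(x, y) / var(x)
b2 = np.cov(x, y, bias=True)[0, 1] / np.var(x)
b1 = y.mean() - b2 * x.mean()

# The same estimates from statsmodels, which also minimises the RSS
res = sm.OLS(y, sm.add_constant(x)).fit()
print(b1, b2)        # manual estimates
print(res.params)    # [b1, b2] from statsmodels
```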

5
Q

OLS assumptions

A
  • LRM is linear in its parameters
  • regressors = fixed/non-stochastic
  • exogeneity - expected value of error term = 0 given values of X
  • homoscedasticity - constant variance of each u given values of X
  • no multicollinearity - no exact linear relationship between regressors
  • u follows normal distribution
6
Q

OLS estimators are BLUE

A

best linear unbiased estimators

  • estimators are linear functions of Y
  • on average they are equal to the true parameter values
  • they have minimum variance i.e. efficient
7
Q

standard deviation of error term =

A

standard error of the regression

= √(RSS/df)
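A quick check of the formula on simulated data (again assuming numpy/statsmodels):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2.0 + 0.5 * x + rng.normal(0, 1, 100)   # true error sd = 1

res = sm.OLS(y, sm.add_constant(x)).fit()

# Standard error of the regression: sqrt(RSS / (n - k))
ser = np.sqrt(res.ssr / res.df_resid)
print(ser)                      # should be close to the true sd of 1
print(np.sqrt(res.mse_resid))   # statsmodels' equivalent quantity
```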

8
Q

n-k

A

degrees of freedom
n = sample size
k = no. of estimated parameters (including the intercept)

9
Q

hypothesis testing

A

construct H0 and Ha, e.g. H0: B2 = 0 and Ha: B2 ≠ 0

t = b2/se(b2)
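A sketch of the test on simulated data (numpy/statsmodels/scipy assumed):

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 2.0 + 0.5 * x + rng.normal(0, 1, 100)

res = sm.OLS(y, sm.add_constant(x)).fit()

# t-statistic for H0: B2 = 0 against Ha: B2 != 0
b2, se_b2 = res.params[1], res.bse[1]
t_stat = b2 / se_b2
cv = stats.t.ppf(0.975, df=res.df_resid)   # 5% two-sided critical value
print(t_stat, cv, abs(t_stat) > cv)        # reject H0 if |t| > cv
print(res.pvalues[1])                      # the corresponding p-value
```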

10
Q

if |t| > critical value from table

A

reject null

11
Q

type 1 error

A

incorrect rejection of true null

detecting an effect that is not present

12
Q

type 2 error

A

failure to reject false null

failing to detect an effect that is present

13
Q

low p-value

A

suggests that the estimated coefficient is statistically significant

14
Q

p-value < 0.01, 0.05, 0.1

A

statistically significant at 1%, 5%, 10% levels

15
Q

dummy variables

A
0 = absence
1 = presence
16
Q

e.g. 1 if female, 0 if male

A

B2 measures the change in average wage when you go from male to female
b1 = estimated wage for men
b2 = estimated diff btw men and women
b1+b2 = estimated wage for women
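A sketch of this card with made-up wage data (the gap of -3 is purely illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
female = rng.integers(0, 2, 200)                  # 1 if female, 0 if male
wage = 20 - 3 * female + rng.normal(0, 2, 200)    # hypothetical wage gap of -3

res = sm.OLS(wage, sm.add_constant(female)).fit()
b1, b2 = res.params
print(b1)         # estimated average wage for men
print(b2)         # estimated male-female difference
print(b1 + b2)    # estimated average wage for women
```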

17
Q

if exogeneity assumption doesn’t hold

A

leads to biased estimates and therefore we need to adjust for omitted variables

18
Q

quadratic terms

A

capture increasing/decreasing marginal effects

have to generate a new variable and add it to regression
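A sketch of generating the squared term and reading off the (varying) marginal effect, which also previews the next card:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, 200)
y = 1.0 + 2.0 * x - 0.1 * x**2 + rng.normal(0, 1, 200)

X = sm.add_constant(np.column_stack([x, x**2]))   # generate x^2, add it to the regression
res = sm.OLS(y, X).fit()
b1, b2, b3 = res.params

# Marginal effect of x: dY/dx = b2 + 2*b3*x, so it changes with x
print(b2 + 2 * b3 * x.mean())   # marginal effect evaluated at the mean of x
```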

19
Q

marginal effect

A

first derivative of the regression function with respect to the variable of interest

20
Q

interaction variable

A

constructed by multiplying two regressors

allows the magnitude of the effect X has on Y to vary depending on the level of another X
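A sketch with a hypothetical dummy d interacted with x:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
x = rng.uniform(0, 10, 200)
d = rng.integers(0, 2, 200)   # another regressor, here a dummy
y = 1.0 + 0.5 * x + 1.0 * d + 0.3 * x * d + rng.normal(0, 1, 200)

X = sm.add_constant(np.column_stack([x, d, x * d]))   # interaction = x * d
res = sm.OLS(y, X).fit()
b1, b2, b3, b4 = res.params

# The effect of x on y depends on d: b2 when d = 0, b2 + b4 when d = 1
print(b2, b2 + b4)
```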

21
Q

interpreting coefficients

A

how does the regression function respond to a change in a variable

22
Q

if the function is not linear (e.g. a Cobb-Douglas production function)

A

transform to a log-log model so that it is linear in parameters

take logs of both sides and add an error term
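A sketch with a made-up Cobb-Douglas production function Q = A * L^B2:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
L = rng.uniform(1, 100, 200)
Q = 3.0 * L**0.7 * np.exp(rng.normal(0, 0.1, 200))   # Q = A * L^B2 with noise

# Taking logs makes it linear in parameters: ln Q = ln A + B2 ln L + u
res = sm.OLS(np.log(Q), sm.add_constant(np.log(L))).fit()
print(res.params)   # [ln A, B2]; B2 ~ 0.7, the elasticity of Q w.r.t. L
```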

23
Q

log-lin model

A

dependent variable in logs – %
explanatory variables in levels – units
B2 measures relative change in output Q for an absolute change in input

24
Q

lin-log model

A

estimates the absolute change in the dependent variable for a % change in the explanatory variable

25
Q

lin-lin model

A

both variables in levels (units) – e.g. using a linear production function

26
Q

testing for linear combinations

A

compute the se – form the t-stat – compare to the critical value – compute the p-value – reject/don’t reject the null

27
Q

TSS

A

total sum of squares = ESS + RSS

sum of squared deviations of Y from the sample mean = how well we could predict the outcome w/o any regressors

28
Q

ESS

A

explained sum of squares = how much of that variation do our regressors predict

29
Q

RSS

A

residual sum of squares = outcome variation that regressors don’t explain

30
Q

R^2

A

ESS/TSS
overall measure of goodness-of-fit of the estimated regression line
how much of variation is explained by regressors
increases whenever you add more regressors, even irrelevant ones

31
Q

F-stat

A

tests the joint significance of all coefficients
F = (ESS/(k-1)) / (RSS/(n-k))
if F > critical value, reject the null
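A sketch pulling the last few cards together (TSS/ESS/RSS, R^2, F) on simulated data:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
x = rng.uniform(0, 10, 100)
y = 2.0 + 0.5 * x + rng.normal(0, 1, 100)

res = sm.OLS(y, sm.add_constant(x)).fit()
n, k = 100, 2                                      # k counts the intercept here

tss = np.sum((y - y.mean()) ** 2)                  # total sum of squares
ess = np.sum((res.fittedvalues - y.mean()) ** 2)   # explained sum of squares
rss = np.sum(res.resid ** 2)                       # residual sum of squares

print(np.isclose(tss, ess + rss))                  # TSS = ESS + RSS
print(ess / tss, res.rsquared)                     # R^2 two ways
print((ess / (k - 1)) / (rss / (n - k)), res.fvalue)   # F-stat two ways
```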

32
Q

dummy variable trap

A

a situation of perfect multicollinearity

to distinguish btw m categories we can only have m-1 dummies

33
Q

perfect collinearity

A

perfect linear relationship between two or more regressors

one predictor variable can be used to predict another

34
Q

imperfect collinearity

A

one regressor is approximately equal to a linear combination of the other regressors plus a small error term

35
Q

consequences of multicollinearity in the data

A

larger standard errors – smaller t-ratio – wider CI – less likely to reject null

36
Q

homoscedasticity

A

assumption that the error term has the same variance for all observations (doesn’t always hold)

37
Q

heteroscedasticity

A

error terms have unequal variances for different observations

38
Q

consequences of heteroscedasticity

A
  • OLS still consistent and unbiased
  • se either too large or too small so t-stats, F-stats, p-values etc will be wrong
  • OLS no longer efficient
39
Q

dealing with heteroscedasticity

A
  • use log transformation
  • keep using OLS and compute heteroscedasticity-robust standard errors
  • weighted least squares
40
Q

using a logarithmic transformation of the outcome variable

A

e.g. ln(wage) - these variables tend to have more variance at higher values

41
Q

continuing to use OLS and computing heteroscedasticity-robust standard errors

A

regress y on x with robust standard errors

corrects the se to allow for heteroscedasticity
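A sketch, assuming statsmodels, where cov_type="HC1" requests heteroscedasticity-robust standard errors:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
x = rng.uniform(1, 10, 200)
y = 2.0 + 0.5 * x + rng.normal(0, 0.5 * x)   # error variance grows with x

X = sm.add_constant(x)
plain = sm.OLS(y, X).fit()
robust = sm.OLS(y, X).fit(cov_type="HC1")    # robust (White) standard errors

print(plain.bse)    # conventional se, misleading under heteroscedasticity
print(robust.bse)   # robust se; the coefficient estimates are identical
```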

42
Q

weighted least squares

A

more efficient than OLS in the presence of heteroscedasticity
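A sketch assuming the error variance is proportional to x^2 (in practice you would have to model or estimate this):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
x = rng.uniform(1, 10, 200)
y = 2.0 + 0.5 * x + rng.normal(0, 0.5 * x)   # Var(u|x) proportional to x^2

X = sm.add_constant(x)
# Weight each observation by the inverse of its error variance
res = sm.WLS(y, X, weights=1.0 / x**2).fit()
print(res.params, res.bse)
```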

43
Q

omission of relevant variables

A

they’ll be captured by the error term

if they are correlated with the included regressors, the parameter estimates are biased and the exogeneity assumption doesn’t hold