Linear Regression Flashcards

1
Q

Linear regression

A

models the relationship between an independent (explanatory) variable $X$ and a (real-valued) dependent variable $Y$.

2
Q

Intercept of the line

A

the constant term of the linear equation
$Y = \beta_0 + \beta_1 X$
$\beta_0$ is the intercept: the value predicted when $X = 0$ (even without any contribution from $X$, you still have $\beta_0$).
$\beta_1$ is the slope of the line: the increase in $Y$ per unit increase in $X$.
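
A minimal sketch of fitting the intercept and slope by least squares (NumPy assumed; the data values are made up for illustration):

```python
import numpy as np

# toy data, values chosen only for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 7.9, 10.1])

# degree-1 polyfit returns [slope, intercept]
b1, b0 = np.polyfit(x, y, deg=1)
print(f"intercept B_0 = {b0:.3f}, slope B_1 = {b1:.3f}")
# at x = 0 the prediction is just the intercept B_0
```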

3
Q

R^2

A

measures how well the linear regression fits, on a scale from 0 to 1; it takes variance and error into account.
It is the proportion of the variance in $Y$ that the model explains.
For simple linear regression it equals the correlation between $X$ and $Y$, squared.
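
A small check of both views (variance explained, and squared correlation), reusing the made-up data from the previous card; this is a sketch, not part of the cards:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 7.9, 10.1])

b1, b0 = np.polyfit(x, y, deg=1)
y_hat = b0 + b1 * x

ss_res = np.sum((y - y_hat) ** 2)      # residual (unexplained) variation
ss_tot = np.sum((y - y.mean()) ** 2)   # total variation in y
r2 = 1 - ss_res / ss_tot               # fraction of variance explained

r = np.corrcoef(x, y)[0, 1]
print(r2, r ** 2)  # agree up to floating-point error for simple linear regression
```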

4
Q

hypothesis testing

A

use the standard error of the estimated coefficients
Evaluate the likelihood of obtaining a slope/intercept as extreme as the one computed, assuming the null hypothesis is true
null hypothesis: there is no linear relation (the slope is 0)
the model assumes a normally distributed error term added to the linear part
Look at the p-values for the intercept and slope: a p-value is the probability of getting a value at least this extreme under the null hypothesis. If p < 0.05, reject the null at the 5% level; equivalently, the 95% confidence interval for that coefficient does not contain 0.
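
A sketch using statsmodels (an assumption; any OLS implementation that reports p-values and confidence intervals would do) on simulated data:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 1.5 + 0.8 * x + rng.normal(scale=1.0, size=100)  # normally distributed noise

X = sm.add_constant(x)            # adds the intercept column
fit = sm.OLS(y, X).fit()

print(fit.pvalues)                # p-values for intercept and slope
print(fit.conf_int(alpha=0.05))   # 95% confidence intervals
# a slope p-value < 0.05 rejects H0: slope = 0 (no linear relation)
```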

5
Q

standard error

A

the standard error of a statistic is the standard deviation of its sampling distribution, or an estimate of that standard deviation.
The standard error of the mean is a measure of the dispersion of sample means around the population mean.
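
For example, the standard error of the mean can be estimated from a single sample as the sample standard deviation divided by sqrt(n); a sketch on simulated data:

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=10, size=200)

# estimated standard error of the mean: sample std / sqrt(n)
sem = sample.std(ddof=1) / np.sqrt(sample.size)
print(sem)  # how much sample means scatter around the population mean
```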

6
Q

Residual

A

A residual is the vertical distance between a data point and the regression line. Each data point has one residual. They are positive if they are above the regression line and negative if they are below the regression line. In other words, the residual is the error that isn’t explained by the regression line.
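
Computing the residuals for the toy fit from the earlier cards (a sketch):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 7.9, 10.1])

b1, b0 = np.polyfit(x, y, deg=1)
residuals = y - (b0 + b1 * x)  # vertical distance from each point to the line
print(residuals)               # positive above the line, negative below
print(residuals.sum())         # least squares forces these to sum to ~0
```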

7
Q

Confounder

A

A predictor can be significant in simple linear regression (p < 0.05) but not significant in multiple linear regression (p > 0.05).
In statistics, a confounder is a variable that influences both the dependent variable and the independent variable, causing a spurious association. The independent variables are the explanatory variables $X$, and the dependent variable is the response $Y$ being predicted.
If two variables are correlated and both are used to predict $Y$, the accuracy might not be impacted, but the coefficients associated with each variable might no longer be meaningful.
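
A toy simulation of a confounder (statsmodels assumed; variable names are made up): x has no direct effect on y, but both are driven by the confounder, so x looks significant until the confounder is included.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500
confounder = rng.normal(size=n)
x = confounder + rng.normal(scale=0.1, size=n)        # x is driven by the confounder
y = 2.0 * confounder + rng.normal(scale=1.0, size=n)  # y is too; no direct x -> y effect

simple = sm.OLS(y, sm.add_constant(x)).fit()
multi = sm.OLS(y, sm.add_constant(np.column_stack([x, confounder]))).fit()

print(simple.pvalues)  # x looks highly significant on its own
print(multi.pvalues)   # x's p-value is large once the confounder is included
```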

8
Q

Categorical variables

A

use dummy variables coded 0 and 1
Don't use more values (0, 1, 2, ...), as that would imply an ordinal relationship which isn't the case !
A categorical variable can be added to the multiple linear model as a 0/1 dummy: its coefficient then shifts the intercept for that category !
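
A sketch of dummy coding a two-level categorical variable in a multiple linear regression (statsmodels assumed; data made up):

```python
import numpy as np
import statsmodels.api as sm

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
group = np.array(["a", "a", "b", "b", "a", "b"])  # two-level categorical variable
y = np.array([2.0, 4.1, 7.0, 9.2, 10.1, 13.9])

dummy = (group == "b").astype(float)  # 0/1 dummy; don't encode >2 levels as 0, 1, 2, ...
X = sm.add_constant(np.column_stack([x, dummy]))

fit = sm.OLS(y, X).fit()
print(fit.params)  # the dummy's coefficient is the intercept shift for group "b"
```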

9
Q

non linear

A

can add a higher-degree term of a variable (e.g. $X^2$) to introduce nonlinearity; the model stays linear in the coefficients
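
A sketch: adding a squared term keeps the fit linear in the coefficients (statsmodels assumed, data simulated):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = np.linspace(-3, 3, 100)
y = 1.0 + 2.0 * x + 0.5 * x**2 + rng.normal(scale=0.5, size=x.size)

X = sm.add_constant(np.column_stack([x, x**2]))  # extra column for the quadratic term
fit = sm.OLS(y, X).fit()
print(fit.params)  # estimates of the intercept, linear, and quadratic coefficients
```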

10
Q

Adjusted R^2

A

penalizes extra predictors: adjusted R^2 goes down if you add more and more predictors without much gain in explained variance (plain R^2 can only stay the same or go up)
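
The usual formula, written as a small helper (a sketch; n observations, p predictors):

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for a model with p predictors fit on n observations.

    Unlike plain R^2, it can decrease when an added predictor
    contributes too little extra explained variance.
    """
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(adjusted_r2(r2=0.80, n=50, p=3))  # slightly below 0.80
```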

11
Q

Razor method

A

Occam's razor: choose the simplest model if a higher degree or more variables do not give much gain.

12
Q

bias-variance tradeoff

A

The bias of a method is the error caused by the simplifying assumptions built into the method.
The variance of a method is how much the fitted model changes depending on the sampled training data.
The irreducible error is error (noise) in the data itself, so no model can capture this error.
There is a tradeoff between the bias and variance of a model. High-variance methods are accurate on the training set, but overfit noise in the data, so they don't generalize well to new data. High-bias methods are too simple to fit the data closely, but are better at generalizing to new test data.
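
A sketch of the tradeoff using polynomial fits of increasing degree on noisy data (NumPy assumed; the degrees and noise level are arbitrary choices):

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(3)
x = np.linspace(0, 1, 30)
y_train = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)
y_test = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)  # fresh noise

for degree in (1, 3, 15):
    fit = Polynomial.fit(x, y_train, deg=degree)
    train_mse = np.mean((fit(x) - y_train) ** 2)
    test_mse = np.mean((fit(x) - y_test) ** 2)
    print(degree, round(train_mse, 3), round(test_mse, 3))
# degree 1: high bias, poor on both sets
# degree 15: very low training error, but it chases noise, so test error tends to rise
```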

13
Q

Generalize a model

A

cross-validation (k-fold: split the data into K folds, train on K-1 folds and test on the remaining fold, repeat for each fold; try different parameters and keep the model with the best average performance)

step-wise selection (forward selection: start with one predictor and find the best single-predictor model (based on a performance metric), then move to models with two predictors by keeping that predictor fixed, etc.
backward selection: the opposite; start with a model containing all predictors and remove them one by one.
Not guaranteed to find the best subset)

regularization (Lasso and Ridge; Lasso drives some coefficients to exactly 0 and thus performs variable selection at the same time, Ridge shrinks all coefficients and penalizes models whose coefficients are too large)
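
A sketch of k-fold cross-validation and Lasso/Ridge using scikit-learn (an assumption; the dataset is synthetic):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

# 5-fold CV: train on 4 folds, test on the held-out fold, average the scores
for name, model in [("ols", LinearRegression()),
                    ("ridge", Ridge(alpha=1.0)),
                    ("lasso", Lasso(alpha=1.0))]:
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(name, scores.mean())

# Lasso drives some coefficients exactly to 0 (implicit variable selection)
lasso = Lasso(alpha=1.0).fit(X, y)
print("zeroed coefficients:", int(np.sum(lasso.coef_ == 0)))
```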
