Regression Analysis (Udemy Statistics for Data Science and Business Analysis) Flashcards

1
Q

What is a linear regression?

A

A linear regression is a linear approximation of a causal relationship between two or more variables.

2
Q

What is the basic 3 part process of linear regression?

A

1. Get sample data
2. Design a model that works for that sample
3. Make predictions for the whole population
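
The three steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the course's own code: the data values and the names `hours`, `scores`, `fit_simple_regression`, and `predict` are all made up for the example, and the model is a simple one-variable least-squares line.

```python
# Step 1: get sample data (hypothetical numbers, for illustration only).
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]


def fit_simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Step 2: design a model that works for the sample.
b0, b1 = fit_simple_regression(hours, scores)


# Step 3: use the fitted line to predict a value that is not in the sample.
def predict(x_new):
    return b0 + b1 * x_new


print(f"fitted line: y = {b0:.2f} + {b1:.2f} * x")
print(f"prediction at x = 6: {predict(6):.2f}")
```

How far such a prediction can be trusted depends on how representative the sample is of the population.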

3
Q

What is the difference between regression and correlation?

A

Correlation is about the relationship between two variables and does not capture causality. Regression is about how one variable affects another and is based on cause and effect.
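
One way to see the difference is that correlation is symmetric, while a regression has a direction chosen by the analyst. The sketch below uses made-up data and hypothetical helper names (`covariation`, `correlation`, `slope`); note that the cause-and-effect direction is an assumption fed into the regression, not something the arithmetic establishes.

```python
import math

# Hypothetical sample, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def covariation(a, b):
    """Sum of products of deviations from the means of a and b."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


def correlation(a, b):
    """Pearson correlation coefficient: the same whichever variable comes first."""
    return covariation(a, b) / math.sqrt(covariation(a, a) * covariation(b, b))


def slope(predictor, outcome):
    """Least-squares slope when regressing outcome on predictor."""
    return covariation(predictor, outcome) / covariation(predictor, predictor)


r_xy = correlation(x, y)
r_yx = correlation(y, x)
slope_y_on_x = slope(x, y)  # treats x as the cause of y
slope_x_on_y = slope(y, x)  # treats y as the cause of x

print(f"corr(x, y) = {r_xy:.4f}, corr(y, x) = {r_yx:.4f}")
print(f"slope of y on x = {slope_y_on_x:.2f}, slope of x on y = {slope_x_on_y:.2f}")
```

Swapping the variables leaves the correlation unchanged but gives a different regression line.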

4
Q

What is the sum of squares total?

A

SST is the sum of the squared differences between the observed values of the dependent variable and its mean. It is a measure of the total variability of the dataset.

5
Q

What is the sum of squares regression?

A

SSR is the sum of the squared differences between the predicted values and the mean of the dependent variable. It measures how much of the variability the regression line explains, that is, how well your model fits the data.

6
Q

What is the sum of squares error?

A

SSE is the sum of the squared differences between the observed and predicted values. The smaller the error, the better the estimation power of the regression. It is also known as RSS, the residual sum of squares.
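
The three sums of squares can be computed side by side to see how they relate. This is a small sketch on made-up data, assuming a least-squares line with an intercept; under that assumption the total variability splits exactly into an explained and an unexplained part, SST = SSR + SSE.

```python
# Hypothetical sample, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Fit the least-squares line y = intercept + slope * x.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * xi for xi in x]

# SST: observed values vs. the mean (total variability).
sst = sum((yi - mean_y) ** 2 for yi in y)
# SSR: predicted values vs. the mean (variability explained by the line).
ssr = sum((pi - mean_y) ** 2 for pi in predicted)
# SSE: observed vs. predicted values (unexplained variability, also called RSS).
sse = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"SSR + SSE = {ssr + sse:.2f}")
```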

7
Q

What is R²?

A

R² (R-squared) is the share of the total variability of the data that the regression explains: R² = SSR / SST, or equivalently 1 - SSE / SST.

8
Q

What does an R² of 0 mean?

A

That your regression line explains none of the variability of the data

9
Q

What does an R² of 1 mean?

A

That your regression line explains all of the variability of the data
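
The two extremes can be checked numerically. The sketch below assumes the common definition R² = 1 - SSE / SST (equal to SSR / SST for a least-squares line with an intercept). The data are made up, `r_squared` is a hypothetical helper, and the coefficients 2.2 and 0.6 are the least-squares line for this particular sample, computed separately.

```python
# Hypothetical sample, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def r_squared(observed, predicted):
    """R-squared as 1 - SSE / SST."""
    mean_obs = sum(observed) / len(observed)
    sst = sum((o - mean_obs) ** 2 for o in observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1 - sse / sst


mean_y = sum(y) / len(y)

# A "model" that always predicts the mean explains nothing.
r2_mean_only = r_squared(y, [mean_y] * len(y))

# A model whose predictions equal the observations explains everything.
r2_perfect = r_squared(y, y)

# The least-squares line for this sample (coefficients assumed, see lead-in).
r2_line = r_squared(y, [2.2 + 0.6 * xi for xi in x])

print(f"mean only: {r2_mean_only:.2f}")
print(f"perfect fit: {r2_perfect:.2f}")
print(f"fitted line: {r2_line:.2f}")
```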

10
Q

What is a good R²?

A

It depends on the field. In physics or chemistry a good R² is between 0.7 and 0.99, but in the social sciences 0.2 could be fantastic. It depends on the complexity of the topic and how many variables are believed to be in play.

11
Q

What is OLS?

A

Ordinary least squares. The OLS line is the line through the data with the least error, that is, the lowest sum of squared errors (SSE).
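
"Least error" here means the lowest sum of squared errors. The sketch below fits the OLS line with the closed-form formulas for one predictor and then compares its SSE with a handful of nudged lines. The data and the nudge sizes are arbitrary choices for illustration, and checking a few alternatives is a demonstration rather than a proof.

```python
# Hypothetical sample, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def sse(intercept, slope):
    """Sum of squared errors of the line y = intercept + slope * x on the sample."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Closed-form OLS estimates for a single predictor.
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
ols_slope = (
    sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    / sum((xi - mean_x) ** 2 for xi in x)
)
ols_intercept = mean_y - ols_slope * mean_x
ols_sse = sse(ols_intercept, ols_slope)

# SSE of lines obtained by nudging the intercept and/or the slope.
nudged_sse = [
    sse(ols_intercept + d0, ols_slope + d1)
    for d0 in (-0.5, 0.0, 0.5)
    for d1 in (-0.2, 0.0, 0.2)
    if (d0, d1) != (0.0, 0.0)
]

print(f"OLS line: y = {ols_intercept:.2f} + {ols_slope:.2f} * x, SSE = {ols_sse:.2f}")
print(f"smallest SSE among nudged lines: {min(nudged_sse):.2f}")
```

Every nudged line has a larger SSE than the OLS line on this sample.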

12
Q

What’s generally better, multiple or simple regression? And why?

A

Multiple regressions are generally better than simple ones. With each additional variable, the explanatory power (R²) can only increase or stay the same.
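
The claim about explanatory power can be checked on a small example by fitting the same outcome first with one predictor and then with two. The sketch assumes made-up data, uses the closed-form least-squares solution for exactly two predictors, and measures explanatory power as in-sample R²; `dev_products` and `r_squared` are hypothetical helper names.

```python
# Hypothetical sample with two candidate predictors, for illustration only.
x1 = [1, 2, 3, 4, 5]
x2 = [1, 3, 4, 2, 5]
y = [2, 4, 5, 4, 5]


def dev_products(a, b):
    """Sum of products of deviations from the means of a and b."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


def r_squared(observed, predicted):
    """R-squared as 1 - SSE / SST."""
    sst = dev_products(observed, observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1 - sse / sst


n = len(y)
mean_y = sum(y) / n
mean_x1 = sum(x1) / n
mean_x2 = sum(x2) / n

# Simple regression: y on x1 only.
slope = dev_products(x1, y) / dev_products(x1, x1)
intercept = mean_y - slope * mean_x1
r2_simple = r_squared(y, [intercept + slope * a for a in x1])

# Multiple regression: y on x1 and x2 (closed form for two predictors).
s11, s22, s12 = dev_products(x1, x1), dev_products(x2, x2), dev_products(x1, x2)
s1y, s2y = dev_products(x1, y), dev_products(x2, y)
den = s11 * s22 - s12 ** 2  # nonzero as long as x1 and x2 are not collinear
b1 = (s22 * s1y - s12 * s2y) / den
b2 = (s11 * s2y - s12 * s1y) / den
b0 = mean_y - b1 * mean_x1 - b2 * mean_x2
r2_multiple = r_squared(y, [b0 + b1 * a + b2 * b for a, b in zip(x1, x2)])

print(f"R-squared with x1 only: {r2_simple:.3f}")
print(f"R-squared with x1 and x2: {r2_multiple:.3f}")
```

The comparison is made on the same sample the models were fitted to, which is the sense in which R² cannot go down when a variable is added.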

13
Q

What does the F-test do?

A

The F-test tests the overall significance of the model. The lower the F-statistic, the closer the model is to being non-significant.
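
For a regression with an intercept, the F-statistic is commonly computed as F = (SSR / k) / (SSE / (n - k - 1)), where k is the number of predictors and n the number of observations. The sketch below applies that formula to two made-up samples, one with a clearer linear trend than the other. The helper names are hypothetical, and turning the statistic into a p-value needs the F distribution, which is not shown here.

```python
def fit_line(x, y):
    """Return the least-squares predictions for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (
        sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        / sum((xi - mean_x) ** 2 for xi in x)
    )
    intercept = mean_y - slope * mean_x
    return [intercept + slope * xi for xi in x]


def f_statistic(observed, predicted, num_predictors):
    """F = (SSR / k) / (SSE / (n - k - 1)) for a model with an intercept."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ssr = sum((p - mean_obs) ** 2 for p in predicted)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return (ssr / num_predictors) / (sse / (n - num_predictors - 1))


# Hypothetical samples, for illustration only.
x = [1, 2, 3, 4, 5]
y_clear_trend = [2, 4, 5, 4, 5]
y_weak_trend = [4, 2, 5, 4, 5]

f_clear = f_statistic(y_clear_trend, fit_line(x, y_clear_trend), num_predictors=1)
f_weak = f_statistic(y_weak_trend, fit_line(x, y_weak_trend), num_predictors=1)

print(f"F with the clearer trend: {f_clear:.2f}")
print(f"F with the weaker trend: {f_weak:.2f}")
```

The sample with the weaker trend gives the lower F-statistic, in line with the card above.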
