L18 - Regression 2 Flashcards

1
Q

What are the two types of models, and how do they differ?

A

Deterministic models contain no randomness.
Probabilistic models include randomness.
Both describe the relationship between variables.

2
Q

Describe what a deterministic model is and provide an example.

A

They hypothesise an exact relationship between two variables. They are suitable when prediction error is negligible.
- An example is a linear distance-time graph, whose slope gives the speed.

3
Q

Describe a probabilistic model and provide an example.

A

Probabilistic models hypothesise two components:
- Deterministic component
- Random error
Example: income depends on education level but still varies among people with the same education level.
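
The contrast between the two model types can be sketched in a few lines of Python. This is only an illustration: the function names and the values of b0, b1 and sd are made up for the example and are not real estimates.

```python
import random

def deterministic_distance(speed, time):
    # Deterministic: the same inputs always give exactly the same output.
    return speed * time

def probabilistic_income(years_education, rng, b0=20000.0, b1=3000.0, sd=5000.0):
    # Probabilistic: a deterministic component plus a random error.
    # b0, b1 and sd are hypothetical values chosen only for illustration.
    deterministic_part = b0 + b1 * years_education
    random_error = rng.gauss(0.0, sd)
    return deterministic_part + random_error

rng = random.Random(0)  # fixed seed so the run is repeatable

# Same inputs, same answer every time.
same_inputs = [deterministic_distance(60, 2) for _ in range(3)]

# Same education level, different incomes because of the random error.
incomes = [probabilistic_income(12, rng) for _ in range(3)]

print(same_inputs)
print(incomes)
```

Calling the deterministic function repeatedly returns the same distance, while the three simulated incomes differ even though the education level is identical.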

4
Q

Within the line-of-best-fit regression model, what does B1 represent?

A

B1 estimates the change in Y for a one-unit increase in X (AKA the slope).

5
Q

Within the line-of-best-fit regression model, what does B0 represent?

A

B0 represents the estimated value of Y when X is zero (AKA the Y-intercept).
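
To make B0 and B1 concrete, here is a minimal sketch that computes them with the usual least-squares formulas. The helper name fit_line and the small dataset are assumptions made for this example only.

```python
def fit_line(xs, ys):
    # Least-squares estimates for the line Y = B0 + B1 * X.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx              # slope: change in Y per one-unit change in X
    b0 = mean_y - b1 * mean_x   # intercept: value of Y when X is zero
    return b0, b1

# Made-up data lying exactly on Y = 1 + 2X.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]

b0, b1 = fit_line(xs, ys)
print(b0, b1)
```

For this dataset the intercept comes out as 1 and the slope as 2, matching the line the points were taken from.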

6
Q

What are the four cautions of regression?

A
  • Spurious relationships
  • Extrapolation
  • Generalisation
  • Causation
7
Q

What is a spurious correlation in regression, and why is it a problem?

A

A spurious correlation occurs when two variables show a mathematical relationship but are not actually directly linked. It is a problem because the fitted line suggests a connection that does not really exist.

8
Q

What is extrapolation in terms of a regression model, and why is it a problem?

A

Extrapolation occurs when the line of best fit is used to make inferences outside the range of the data points. This can be a problem because the relationship between the variables could change at higher or lower values of X.
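
A small sketch of why extrapolation is risky, assuming (purely for illustration) that the true relationship is Y = X^2 but data were only collected for X between 0 and 3. The helper fit_line and the data are hypothetical.

```python
def fit_line(xs, ys):
    # Least-squares estimates for the line Y = B0 + B1 * X.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def true_y(x):
    # Assumed true relationship; a straight line only approximates it locally.
    return x ** 2

xs = [0, 1, 2, 3]
ys = [true_y(x) for x in xs]
b0, b1 = fit_line(xs, ys)

# Prediction inside the observed range of X.
pred_inside = b0 + b1 * 2
error_inside = abs(pred_inside - true_y(2))

# Prediction far outside the observed range of X (extrapolation).
pred_outside = b0 + b1 * 10
error_outside = abs(pred_outside - true_y(10))

print(error_inside, error_outside)
```

Inside the observed range the line is off by a small amount, while at X = 10 the error is far larger because the line keeps going straight and the assumed true curve does not.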

9
Q

What is generalisation in a regression model, and why can it be a problem?

A

Generalisation occurs when a conclusion is drawn from a small dataset and applied to a larger population. This can be a problem because the sample may not accurately represent the population.

10
Q

What is an outlier, and why can it be a problem within a regression analysis?

A

Outliers are data points that differ significantly from the other data points in a sample. They can be a problem because they can skew the line of best fit and reduce the r^2 value.
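
The effect of an outlier on r^2 can be checked with a short sketch. The helper r_squared and both datasets are made up for illustration; r^2 is computed here as the squared correlation between X and Y.

```python
def r_squared(xs, ys):
    # Squared correlation coefficient between X and Y.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)

# Made-up data lying exactly on a straight line.
xs_clean = [1, 2, 3, 4, 5]
ys_clean = [2, 4, 6, 8, 10]

# Same data plus one point far away from the others.
xs_outlier = xs_clean + [3]
ys_outlier = ys_clean + [20]

r2_clean = r_squared(xs_clean, ys_clean)
r2_outlier = r_squared(xs_outlier, ys_outlier)

print(r2_clean, r2_outlier)
```

With the single extra point, r^2 drops from 1 to roughly 0.2 in this example, so one unusual observation is enough to make a perfect linear fit look weak.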

11
Q

What is an influential point within a dataset?

A

It is a point that significantly affects the line of best fit.

12
Q

How does the removal of an influential point within a dataset affect the line of best fit?

A

It changes the slope of the line (B1).
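
A minimal sketch of this effect, fitting the line with and without one influential point. The helper fit_line and the dataset are assumptions made for the example.

```python
def fit_line(xs, ys):
    # Least-squares estimates for the line Y = B0 + B1 * X.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Made-up data: four points on Y = X plus one point far from the rest.
xs = [1, 2, 3, 4, 10]
ys = [1, 2, 3, 4, 0]

# Fit using every point, including the influential one at X = 10.
_, slope_with = fit_line(xs, ys)

# Fit again after removing the influential point.
_, slope_without = fit_line(xs[:-1], ys[:-1])

print(slope_with, slope_without)
```

With the point at X = 10 included the slope is slightly negative, and once it is removed the slope becomes 1, so a single point flips the direction of the fitted line in this example.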

13
Q

Are all influential points outliers?

A

No.
