Regression Analysis Flashcards

1
Q

What is regression analysis?

A

Regression analysis uses a mathematical model to predict a variable (y) based on values of other variables (x1, x2, … xk). It is the process of finding the mathematical model that relates y to a set of independent variables and best fits the data.

2
Q

What is a dependent variable?

A

A dependent variable (y), aka response variable, is the variable to be modelled and/or predicted.

3
Q

What is an independent variable?

A

Independent variables (x1, x2, … xk) are the variables used to predict the response variable.

4
Q

What is the equation for a probabilistic model?

A

y = E(y) + ε

Where E(y) = mean of y (i.e. the expected value of y)

Where ε = some random error
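
As a minimal sketch of this decomposition, the snippet below simulates observations from y = E(y) + ε. The straight-line form of E(y), the parameter values, and the normal distribution for ε are all assumptions chosen for illustration; none of them come from the card.

```python
import random

def mean_response(x, beta0=2.0, beta1=0.5):
    """Deterministic component E(y); the straight-line form is an assumption."""
    return beta0 + beta1 * x

def simulate_y(x, sigma=1.0, rng=random):
    """One observation y = E(y) + eps, with eps drawn from Normal(0, sigma)."""
    eps = rng.gauss(0.0, sigma)
    return mean_response(x) + eps

rng = random.Random(42)  # fixed seed so the run is repeatable
x = 4.0
ys = [simulate_y(x, rng=rng) for _ in range(10_000)]
sample_mean = sum(ys) / len(ys)

print("E(y) at x = 4:", mean_response(x))
print("average of simulated y:", round(sample_mean, 2))
```

Individual simulated values differ from E(y) because of ε, but their average should settle close to E(y) when ε has mean 0.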

5
Q

A probabilistic model is based on the theory of probability or the fact that _______________ plays a role in predicting future events.

A

randomness

6
Q

What is the difference between a probabilistic model and a deterministic model?

A

A probabilistic model is based on the fact that randomness plays a role in predicting future events.

A deterministic model has no random component: it assumes y can be predicted exactly from the values of x, with no allowance for error.

7
Q

There are 7 major steps in regression analysis.

What are they?

(Hint: H, C, E, S, S, C, U)

A
  1. Hypothesize the form of the model for E(y), the expected value of y.
  2. Collect the sample data.
  3. Estimate the unknown parameters in the model using the sample data.
  4. Specify the probability distribution of ε (random error) and estimate any unknown parameters.
  5. Statistically check model adequacy.
  6. Check the validity of the assumptions about the random error; make modifications if necessary.
  7. Use the model for prediction and estimation.
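
The steps above can be sketched for the simplest case, a straight-line model fitted by least squares. The data values, the straight-line form, and the assumption that ε has mean 0 and constant variance are illustrative assumptions, not part of the card; step 6 is reduced here to a quick look at the residuals.

```python
# Step 1: hypothesize E(y) = beta0 + beta1 * x (an assumed straight-line form).
# Step 2: collect sample data (made-up values for illustration).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 1.0, 2.0, 2.0, 4.0]
n = len(x)

# Step 3: estimate beta0 and beta1 by least squares.
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Step 4: assume eps has mean 0 and constant variance sigma^2,
# and estimate sigma^2 by s^2 = SSE / (n - 2).
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)
s2 = sse / (n - 2)

# Step 5: check model adequacy with the coefficient of determination r^2.
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
r2 = 1 - sse / ss_yy

# Step 6: inspect the residuals; least-squares residuals sum to (about) zero.
residual_sum = sum(residuals)

# Step 7: use the fitted model for prediction.
def predict(x_new):
    return b0 + b1 * x_new

print(b0, b1, s2, r2, predict(6.0))
```

A fuller analysis would add formal tests and residual plots in steps 5 and 6; this sketch only shows where each step sits in the workflow.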
8
Q

There are 6 steps in regression for a probabilistic model [y = E(y) + ε].

What are they?

(Hint: H, C, E, P, C, P)

A
  1. Hypothesize the form of the model for E(y), the expected value of y.
  2. Collect the sample data.
  3. Estimate the unknown parameters in the model.
  4. Specify the probability distribution of ε.
  5. Statistically check model adequacy.
  6. Use the model for prediction and estimation.
9
Q

There are two types of regression data.

What are they?

A

Observational: where values of x are uncontrolled.

Experimental: where values of x are controlled via a designed experiment.
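
One way to picture the distinction: in a designed experiment the analyst chooses the x settings before any y is measured. The sketch below builds a small grid of settings; the levels and the recorded values are hypothetical, and a full grid is only one of many possible designs.

```python
from itertools import product

# Experimental data: the x values are fixed in advance by the analyst.
# The levels below are hypothetical.
x1_levels = [10, 20, 30]
x2_levels = [0.5, 1.0]
design = list(product(x1_levels, x2_levels))  # every (x1, x2) combination

# Observational data: the x values are simply recorded as they occur,
# so they follow no planned pattern (made-up records for illustration).
observed = [(12, 0.7), (27, 0.9), (18, 0.6)]

print(len(design), "planned runs:", design)
```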

10
Q

What is the difference between simple linear regression and multiple regression?

A

Simple Linear Regression involves a single independent variable.

Multiple Regression involves two or more independent variables.
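
Both cases can be written in one first-order form, E(y) = β0 + β1x1 + … + βkxk, with simple linear regression as the special case k = 1. A minimal sketch, using hypothetical coefficient values and a hypothetical helper name:

```python
def linear_mean(betas, xs):
    """E(y) = beta0 + beta1*x1 + ... + betak*xk for k = len(xs) predictors."""
    if len(betas) != len(xs) + 1:
        raise ValueError("need one intercept plus one coefficient per x")
    return betas[0] + sum(b * x for b, x in zip(betas[1:], xs))

# Simple linear regression: one independent variable (k = 1).
simple = linear_mean([1.0, 2.0], [3.0])

# Multiple regression: two or more independent variables (here k = 2).
multiple = linear_mean([1.0, 2.0, -0.5], [3.0, 4.0])

print(simple, multiple)
```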

11
Q

How does this equation need to be modified to be considered a prediction equation?

E(y) = β₀ + β₁x₁ + β₂x₂ + β₃x₁x₂ + β₄x₁² + β₅x₂²

A

We would replace E(y) with ŷ and each unknown β with its estimate β̂, obtained from the sample data:

ŷ = β̂₀ + β̂₁x₁ + β̂₂x₂ + β̂₃x₁x₂ + β̂₄x₁² + β̂₅x₂²

where ŷ is the predicted value of y.
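
A minimal sketch of evaluating this prediction equation. The estimates passed in are hypothetical placeholders; in practice they would come from fitting the model to sample data.

```python
def predict(b, x1, x2):
    """y-hat for the second-order model; b holds the six estimates b0..b5."""
    b0, b1, b2, b3, b4, b5 = b
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2 + b4 * x1 ** 2 + b5 * x2 ** 2

# Hypothetical estimates, for illustration only.
estimates = (1.0, 2.0, 3.0, 0.5, -1.0, 0.25)
y_hat = predict(estimates, x1=2.0, x2=4.0)
print(y_hat)
```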

12
Q

What is missing from the equation if we are supposed to use it as a probabilistic model?

E(y) = β₀ + β₁x₁ + β₂x₂ + β₃x₁x₂ + β₄x₁² + β₅x₂²

A

A probabilistic model must include the random error term, + ε, and is written for y instead of E(y). The equation would be updated to read as follows:

y = β₀ + β₁x₁ + β₂x₂ + β₃x₁x₂ + β₄x₁² + β₅x₂² + ε

13
Q

Within the following mathematical equation for the deterministic model, what do β₀, β₁, β₂, β₃, β₄, and β₅ represent?

E(y) = β₀ + β₁x₁ + β₂x₂ + β₃x₁x₂ + β₄x₁² + β₅x₂²

A

β₀ through β₅ are constants (the unknown parameters of the model) whose values have to be estimated from the sample data.
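
To illustrate what estimating these constants from sample data involves, the sketch below fits the six-term model by least squares, solving the normal equations (X'X)b = X'y with plain Gaussian elimination. The data are generated without error from made-up β values, so the fit should recover them; the grid of x values and all helper names are assumptions for illustration.

```python
from itertools import product

def row(x1, x2):
    """Terms of the model: 1, x1, x2, x1*x2, x1^2, x2^2."""
    return [1.0, x1, x2, x1 * x2, x1 ** 2, x2 ** 2]

def solve(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [a_row[:] + [b_i] for a_row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z

def least_squares(X, y):
    """Least-squares estimates from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

# Made-up "true" parameters and an error-free sample on a 3 x 3 grid.
true_betas = [1.0, 2.0, -1.0, 0.5, 0.25, -0.75]
points = list(product([0.0, 1.0, 2.0], repeat=2))
X = [row(x1, x2) for x1, x2 in points]
y = [sum(b * t for b, t in zip(true_betas, r)) for r in X]

estimates = least_squares(X, y)
print([round(b, 6) for b in estimates])
```

With real data the y values contain random error, so the estimates would only approximate the underlying parameters.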

14
Q

Within the following mathematical equation for the deterministic model, what does E(y) represent?

E(y) = β₀ + β₁x₁ + β₂x₂ + β₃x₁x₂ + β₄x₁² + β₅x₂²

A

E(y) represents the mean percentage price increase for a given set of values of x₁ and x₂.

15
Q

What is the purpose of collecting sample data for regression analysis?

A

The purpose of collecting sample data is to estimate the unknown parameters of a regression model (the β's).

16
Q

What is the difference between observational data and experimental data?

A

Observational data are data in which the values of the independent variables are observed as they occur, without being controlled.

Experimental data are data in which the values of the independent variables are set in advance (controlled) by the researcher.

17
Q

What are the two components of a probabilistic model?

A

The deterministic component, E(y) (the mean of y), and the random error, ε.

18
Q

Which type of regression involves two or more independent variables?

A

Multiple regression

19
Q

True or False:

Multiple regression involves two or more dependent variables.

A

False.

Multiple regression involves two or more independent variables.

20
Q

True or False:

Regression analysis is used to predict the value of a dependent variable from the values of one or more independent variables.

A

True