Chapter 7: Linear Regression Flashcards

Question 1

Q

Define ‘Linear model’.

Answer

A

An equation of the form
y-hat = b0 + b1 x
where the x-variable is being used as an explanatory variable to help predict the response variable y. To interpret a linear model, we need to know the variables (along with their W’s) and their units.

Question 2

Q

Define ‘Model’.

Answer

A

An equation or formula that simplifies and represents reality.

Question 3

Q

Define ‘Predicted (fitted) value’.

Answer

A

The value of y-hat found for a given x-value in the data. A predicted value is found by substituting the x-value into the regression equation. The predicted values are the values on the fitted line; the points (x, y-hat) all lie exactly on the fitted line.
The predicted values are found from the linear model that we fit:
y-hat = b0 + b1 x.
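
The substitution step can be sketched in Python as follows. This is only an illustration: the coefficients 2.2 and 0.6, the x-values, and the helper name predict are made up for the example and do not come from the chapter.

```python
# Hypothetical coefficients, chosen only for illustration.
b0 = 2.2   # intercept, in y-units
b1 = 0.6   # slope, in y-units per x-unit


def predict(x):
    """Return the predicted (fitted) value y-hat for a given x."""
    return b0 + b1 * x


xs = [1, 2, 3, 4, 5]                 # made-up x-values
fitted = [predict(x) for x in xs]    # every point (x, y-hat) lies on the line

for x, y_hat in zip(xs, fitted):
    print(f"x = {x}: y-hat = {y_hat:.1f}")
```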

Question 4

Q

Define ‘Residuals’.

Answer

A

The differences between the observed values of the response variable y and the corresponding values predicted by the regression model - or, more generally, values predicted by any model (y-hat).
Residual = Obs. y-value - Pred. y-value = y - y-hat.
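
A small sketch of the subtraction, assuming a made-up data set of five points and the line y-hat = 2.2 + 0.6 x (which happens to be the least squares line for these invented numbers):

```python
# Made-up data for illustration; not from the chapter.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6   # least squares fit for this particular data set

# Residual = observed y minus predicted y.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

for x, y, e in zip(xs, ys, residuals):
    print(f"x = {x}, y = {y}, residual = {e:+.1f}")

# For a least squares line the residuals sum to zero (up to rounding).
print(f"sum of residuals = {sum(residuals):.6f}")
```

A positive residual means the model underestimated the observed value; a negative residual means it overestimated.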

Question 5

Q

Define ‘Regression line (Line of best fit)’.

Answer

A

The particular linear equation
y-hat = b0 + b1 x
that satisfies the least squares criterion is called the least squares regression line. Casually, we often just call it the regression line, or the line of best fit.

Question 6

Q

Define ‘Least squares’.

Answer

A

The least squares criterion specifies the unique line that minimizes the variance of the residuals or, equivalently, the sum of the squared residuals.
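
One way to see the criterion at work is to compare the sum of squared residuals for the least squares line against a few nearby lines. The sketch below uses made-up data, and the helper name sse is hypothetical. It checks only a handful of neighbouring lines, so it illustrates the idea rather than proving it.

```python
# Made-up data for illustration; not from the chapter.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def sse(b0, b1):
    """Sum of squared residuals for the line y-hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


best = sse(2.2, 0.6)   # the least squares line for this data

# Nudge the intercept and slope a little in every direction.
steps = (-0.1, 0.0, 0.1)
others = [
    sse(2.2 + d0, 0.6 + d1)
    for d0 in steps
    for d1 in steps
    if (d0, d1) != (0.0, 0.0)
]

print(f"least squares line: {best:.2f}")
print(f"smallest nearby alternative: {min(others):.2f}")
```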

Question 7

Q

Define ‘Slope’.

Answer

A

The slope, b1, gives a value in “y-units per x-unit.” A change of one unit in x is associated with a change of b1 units in the predicted value of y.
The slope can be found by
b1 = r (sy / sx)
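
A minimal sketch of this formula in Python, using only the standard library and made-up data. The correlation r is computed here from its definition as the sum of products of deviations divided by (n - 1) sx sy.

```python
from statistics import mean, stdev

# Made-up data for illustration; not from the chapter.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)
sx, sy = stdev(xs), stdev(ys)   # sample standard deviations (n - 1 divisor)

# Correlation coefficient r.
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)

# Slope of the least squares line.
b1 = r * sy / sx

print(f"r = {r:.4f}, sx = {sx:.4f}, sy = {sy:.4f}, b1 = {b1:.4f}")
```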

Question 8

Q

Define ‘Intercept’.

Answer

A

The intercept, b0, gives a starting value in y-units. It’s the y-hat-value when x is 0. You can find the intercept from b0 = y-bar - b1 x-bar.
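
As a quick sketch, assuming made-up data and a slope of 0.6 (the least squares slope for these invented numbers), the intercept follows directly from the two means:

```python
from statistics import mean

# Made-up data for illustration; not from the chapter.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b1 = 0.6   # slope, assumed already computed for this data

x_bar, y_bar = mean(xs), mean(ys)

# Intercept of the least squares line.
b0 = y_bar - b1 * x_bar

# Consequence of the formula: the line passes through (x-bar, y-bar).
y_hat_at_mean = b0 + b1 * x_bar

print(f"b0 = {b0:.2f}")
print(f"y-hat at x-bar = {y_hat_at_mean:.2f}, y-bar = {y_bar:.2f}")
```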

Question 9

Q

Define ‘Regression to the mean’.

Answer

A

Because the correlation is always less than 1.0 in magnitude (unless the points fall exactly on a line), each predicted y-hat tends to be fewer standard deviations from its mean than its corresponding x was from its mean.
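
In standardized units the least squares line becomes z-y-hat = r z-x, which is where the shrinkage toward the mean comes from. The sketch below checks this on made-up data; the variable names are illustrative only.

```python
from statistics import mean, stdev

# Made-up data for illustration; not from the chapter.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)
sx, sy = stdev(xs), stdev(ys)
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)

b1 = r * sy / sx
b0 = y_bar - b1 * x_bar

# z-scores of each x and of its predicted value y-hat.
z_x = [(x - x_bar) / sx for x in xs]
z_y_hat = [((b0 + b1 * x) - y_bar) / sy for x in xs]

for zx, zy in zip(z_x, z_y_hat):
    print(f"x is {zx:+.2f} SDs from its mean; y-hat is {zy:+.2f} SDs from its mean")
```

Each predicted value sits r times as many standard deviations from the mean of y as its x sits from the mean of x.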

Question 10

Q

Define ‘Standard deviation of the residuals (se)’.

Answer

A

The standard deviation of the residuals is found by se = sqrt (∑e^2 / (n-2) ).
When the residuals are roughly Normally distributed (check their histogram), their sizes can be well described by using this standard deviation and the 68-95-99.7 Rule.
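
A short sketch of the formula on made-up data, assuming the least squares line y-hat = 2.2 + 0.6 x for these invented points. Note the n - 2 divisor rather than n - 1.

```python
from math import sqrt

# Made-up data for illustration; not from the chapter.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6   # least squares fit for this particular data set

n = len(xs)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Standard deviation of the residuals: sqrt(sum of e^2 / (n - 2)).
se = sqrt(sum(e ** 2 for e in residuals) / (n - 2))

print(f"se = {se:.4f}")
```

If the residuals are roughly Normal, about 95% of them would be expected to fall within 2 se of zero.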

Question 11

Q

Define ‘Coefficient of determination R^2’.

Answer

A

The square of the correlation between y and x.

R^2 gives the fraction of the variability of y accounted for by the least squares linear regression on x.
R^2 is an overall measure of how successful the regression is in linearly relating y to x.
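
The two descriptions of R^2 (the squared correlation, and the fraction of variability accounted for) can be checked against each other. The sketch below does this on made-up data, using 1 - (sum of squared residuals) / (total sum of squares about y-bar) for the second form.

```python
from math import sqrt
from statistics import mean

# Made-up data for illustration; not from the chapter.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(xs), mean(ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)   # total variability of y

r = sxy / sqrt(sxx * syy)                 # correlation
b1 = sxy / sxx                            # least squares slope
b0 = y_bar - b1 * x_bar                   # least squares intercept

sum_sq_resid = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

r_squared_from_r = r ** 2
r_squared_from_fit = 1 - sum_sq_resid / syy

print(f"r^2 = {r_squared_from_r:.3f}")
print(f"1 - SSE/SST = {r_squared_from_fit:.3f}")
```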

Question 12

Q

Understand the correlation coefficient as the number of ___ ___ by which one variable is expected to change for a one ___ ___ change in the other. (r is never greater than 1 in magnitude; recall its sign.)

Answer

A

Standard deviations; standard deviation. Review the sections that discussed it.

Question 13

Q

Always check the ___ to check for violations of assumptions and conditions and to identify any outliers.

Answer

A

Residuals.

Any bends (Straight Enough Condition)?
Outliers?
Change in spread?

Question 14

Q

What are the conditions for regression?

Answer

A

Quantitative Variables Condition
Straight Enough
Does the Plot Thicken?
Outlier Condition

(14 cards)