Regression Flashcards
(25 cards)
Associations Between Variables:
Two ________ (something numerical) variables measured on the same cases are associated if knowing the value of one of the variables tells you something that you would not otherwise know about the value of the other variable.
quantitative
displays trends, patterns, associations, and relationships between two quantitative variables (descriptives)
scatterplot
(tells us the trend)
- Positive relationship: as x increases, y (increases/decreases)
- Negative relationship: as x increases, y (increases/decreases)
- increases
- decreases
- X-axis is the (independent/dependent) variable.
- Y-axis is the (independent/dependent) variable.
- independent (or predictor/explanatory)
- dependent (or response)
If the dots on a scatterplot cluster more tightly around a line, the relationship is (stronger/weaker).
stronger
Which can be controlled or manipulated? Independent or dependent variable?
Independent variable
Which cannot be controlled or manipulated? Independent or dependent variable?
Dependent variable
the “results” or “predictions” based on the data from the explanatory variable.
response variable (y depends on the value of x)
striking deviations in a scatterplot
outliers
We describe the overall pattern of a scatterplot by _____, ______, and _____ of the relationship.
- direction (positive or negative)
- form (linear (straight-line) or non-linear)
- strength (between x and y variables)
(3 things that scatter plot tells us)
If the dots on a scatterplot fall exactly on a straight line, this is a ________ relationship.
perfect (positive or negative) linear relationship
Linear correlation coefficient; the numerical measure to interpret the STRENGTH of the linear relationship between two quantitative variables
r
The r value (correlation coefficient) is always between _____ and _____.
-1 and 1
(sign only gives the direction of the relationship)
About r:
1. If r > 0…
2. r < 0…
3. r = 0
- positive linear relationship between x and y
- negative linear relationship between x and y
- no linear relationship between x and y (a non-linear relationship is still possible)
T or F: We say a linear relationship is strong if the points lie close to a straight line and weak if they are widely scattered about a line.
True
Values of r near 0 indicate a very ____ linear relationship; closer to 1 or -1 is a ______ linear relationship.
weak; strong (this is how r tells us the strength)
(r = -1 or r = 1 indicates a perfect linear relationship)
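Example (a minimal Python sketch, not from the original cards): one way to compute r by hand and see the range and strength rules in action. The function name `pearson_r` and all data values are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Linear correlation coefficient r for paired quantitative data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, chosen only to illustrate the three cases
x = [1, 2, 3, 4, 5]
perfect_up = [2, 4, 6, 8, 10]    # dots on a rising straight line
perfect_down = [10, 8, 6, 4, 2]  # dots on a falling straight line
noisy_up = [2, 5, 3, 8, 7]       # rising trend with scatter

r_up = pearson_r(x, perfect_up)
r_down = pearson_r(x, perfect_down)
r_noisy = pearson_r(x, noisy_up)

print(round(r_up, 3), round(r_down, 3), round(r_noisy, 3))
```

The two straight-line data sets give r at the ends of the range (1 and -1), while the scattered data set gives a positive r strictly between 0 and 1.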
Coefficient of Determination; the proportion of the variance in the dependent variable that is predictable from the independent variables
r^2 (tells us how good/useful our predictive model is)
Coefficient of Determination:
- gives ________ of fit of regression line
- How well the regression line can predict the ________ variable.
- 1-r^2 is the rest of the variance that is _________.
- is always between ____ and ____.
- goodness
- dependent
- unexplained
- 0 and 1
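Example (a minimal Python sketch, not from the original cards): r^2 computed as the share of the variation in y that a least-squares line accounts for, with 1 - r^2 as the unexplained share. The helper `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept (b0) and slope (b1) for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 5, 3, 8, 7]

b0, b1 = fit_line(xs, ys)
mean_y = sum(ys) / len(ys)

# Variation left over after using the line (unexplained)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
# Total variation of y around its mean
sst = sum((y - mean_y) ** 2 for y in ys)

r_squared = 1 - sse / sst    # explained proportion
unexplained = 1 - r_squared  # the rest of the variance

print(round(r_squared, 2), round(unexplained, 2))
```

For this made-up data the line explains about 65% of the variation in y and leaves about 35% unexplained.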
What is the prediction equation for regression line?
y hat = b0+b1(x)
- b0 = intercept
- b1 = slope
- x = value of the independent variable that we are predicting from
the predicted value of the dependent variable (y) for any given value of x
y hat
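Example (a minimal Python sketch, not from the original cards): plugging a value of x into the prediction equation to get y hat. The coefficients below are hypothetical, not taken from any real output.

```python
def predict(b0, b1, x):
    """Predicted value of y (y hat) for a given x: b0 + b1 * x."""
    return b0 + b1 * x

# Hypothetical coefficients, assumed for illustration
b0 = 1.1  # intercept: predicted y when x = 0
b1 = 1.3  # slope: change in predicted y per 1-unit increase in x

y_hat = predict(b0, b1, 6)
print(round(y_hat, 2))
```

Because the slope here is positive, a larger x always gives a larger predicted y.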
On chart:
1. multiple r =
2. r square =
3. observations =
4. regression under df column =
5. residual under df column =
6. total under df column =
7. under coefficients column, first row
8. under coefficients column, second row
- r (correlation coefficient)
- r^2 (coefficient of determination)
- n (# of observations)
- k (# of independent variables)
- residual degrees of freedom: n - (k+1)
- total degrees of freedom: n - 1
- b0 (intercept)
- b1 (slope: ALWAYS LOOK AT SLOPE SIGN = tells you if relationship is positive or negative)
What is the formula to calculate the residual degrees of freedom?
n - (k+1)
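Example (a minimal Python sketch, not from the original cards): the three entries of the df column worked out from n and k. The function name and the sample sizes are assumptions for illustration.

```python
def degrees_of_freedom(n, k):
    """df column entries for n observations and k independent variables."""
    regression_df = k
    residual_df = n - (k + 1)
    total_df = n - 1
    return regression_df, residual_df, total_df

# Hypothetical case: 25 observations, 1 independent variable
regression_df, residual_df, total_df = degrees_of_freedom(25, 1)
print(regression_df, residual_df, total_df)
```

The regression and residual rows always add up to the total row, which is a quick way to check a reading of the output.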
How does slope sign affect the value of correlation coefficient (r)?
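r always has the same sign as the slope (b1). The multiple r shown on the chart is not negative, so if the slope is negative, put a negative sign in front of r.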
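Example (a minimal Python sketch, not from the original cards): giving r the sign of the slope. This only applies when there is one independent variable, and the numbers are hypothetical.

```python
import math

def signed_r(multiple_r, slope):
    """Attach the slope's sign to the multiple r value read from the output."""
    return math.copysign(multiple_r, slope)

# Hypothetical values read from a regression output
r_negative = signed_r(0.82, -1.5)  # negative slope, so r is negative
r_positive = signed_r(0.82, 2.0)   # positive slope, so r stays positive
print(r_negative, r_positive)
```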
Simple linear regression =
if there is only one independent variable (if multiple, it is a multiple linear regression)