Unit 2 Vocab Flashcards
(16 cards)
Scatterplot
A scatterplot shows the relationship between two quantitative variables measured on the
same cases. In a timeplot the horizontal axis is time. (p. 148)
Direction
A positive direction or association means that, in general, as one variable
increases, so does the other. When increases in one variable generally correspond to
decreases in the other, the association is negative.
Form
The form we care about most is linear (straight), but you should certainly
describe other patterns you see in scatterplots.
Strength
A scatterplot is said to show a strong association if there is little scatter
around the underlying relationship. (p. 149)
Outlier
A point that does not fit the overall pattern seen in the scatterplot. (p. 149)
Response variable, Explanatory variable, x-variable, y-variable
In a scatterplot, you must choose a role for each variable. Assign to the y-axis the variable that you hope to predict or explain. Assign to the x-axis the variable that accounts for, explains, predicts, or is otherwise associated with
the y-variable. (p. 150)
Correlation coefficient
A numerical measure of the direction and strength of a linear
association. (p. 153)
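A minimal sketch of how the correlation coefficient can be computed, assuming the usual z-score formula (sum of the products of z-scores, divided by n - 1). The function name `correlation` and the data below are made up for illustration and do not come from the textbook.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Correlation r: sum of products of z-scores, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data: hours studied (x) and test score (y)
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
r = correlation(hours, scores)
print(round(r, 3))  # positive and close to 1: a strong positive linear association
```

The sign of r gives the direction and its magnitude gives the strength, but only for linear associations.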
Lurking variable
A variable other than x and y that simultaneously affects both variables, accounting for
the association between the two. (p. 160)
Model
An equation or formula that simplifies and represents reality. (p. 173)
Linear Model
An equation of a line. To interpret one, we need to know the
variables (along with their W’s) and their units. (p. 173)
Predicted Value
The value of ŷ (y-hat) found for a given x-value in the data. A predicted value is found by substituting the x-value into the regression equation. The predicted values are the values on the
fitted line; the points (x, ŷ) all lie exactly on the fitted line. (p. 173)
Residuals
The differences between data values and the corresponding values predicted
by the regression model—or, more generally, values predicted by any model. (p. 173)
Residual = observed value - predicted value: e = y - ŷ
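A short worked sketch of predicted values and residuals. The data and the line ŷ = 46 + 6x are assumptions chosen for illustration; `predict` is a hypothetical helper name.

```python
def predict(b0, b1, x):
    """Predicted value y-hat for a given x on the line y-hat = b0 + b1*x."""
    return b0 + b1 * x

# Hypothetical data and an assumed fitted line y-hat = 46 + 6x
xs = [1, 2, 3, 4, 5]
ys = [52, 60, 61, 70, 77]
b0, b1 = 46, 6

predicted = [predict(b0, b1, x) for x in xs]
# Residual = observed value - predicted value
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

print(predicted)  # [52, 58, 64, 70, 76]
print(residuals)  # [0, 2, -3, 0, 1]
```

A positive residual means the point lies above the line (the model underpredicted); a negative residual means it lies below.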
Least squares
The least squares criterion specifies the unique line that minimizes the variance of the
residuals or, equivalently, the sum of the squared residuals. (p. 174)
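To see the criterion at work, the sketch below compares the sum of squared residuals for several candidate lines on made-up data. For this particular data set the least squares line happens to be ŷ = 46 + 6x; the other two lines are arbitrary alternatives, and the function name is hypothetical.

```python
def sum_squared_residuals(b0, b1, xs, ys):
    """Sum of (y - y-hat)^2 for the line y-hat = b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [52, 60, 61, 70, 77]

sse_best = sum_squared_residuals(46, 6, xs, ys)         # least squares line
sse_shifted = sum_squared_residuals(45, 6, xs, ys)      # intercept moved down by 1
sse_steeper = sum_squared_residuals(46, 6.5, xs, ys)    # slope made steeper

print(sse_best, sse_shifted, sse_steeper)  # 14 19 27.75
```

Any line other than the least squares line gives a larger sum of squared residuals.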
Regression to the mean
Because the correlation is always less than 1.0 in magnitude (unless the data are perfectly linear), each predicted ŷ tends to
be fewer standard deviations from its mean than its corresponding x was from its mean.
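In standardized units the least squares prediction is z-of-ŷ = r × z-of-x, which is why predictions sit closer to the mean. A minimal sketch, with illustrative values of r and a hypothetical function name:

```python
def predicted_z(r, z_x):
    """Predicted z-score of y for an x that is z_x standard deviations from its mean."""
    return r * z_x

# Illustrative values: with r = 0.5, an x that is 2 SDs above its mean
# predicts a y only 1 SD above its mean.
z_hat = predicted_z(0.5, 2.0)
print(z_hat)  # 1.0
```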
Regression line
The linear equation ŷ = b0 + b1x
that satisfies the least squares criterion. Also known as the line of best fit.
Slope
The slope, b1, gives a value in “y-units per x-unit.” Changes of one unit in x are associated with changes of b1 units in predicted values of y. It can be found by b1 = r × sy/sx. (p. 176)
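A minimal sketch of finding the slope from summary statistics on made-up data. The intercept formula b0 = ȳ - b1 × x̄ is not stated on this card; it is the standard companion formula and is included here as an assumption so that the full line can be written.

```python
from statistics import mean, stdev

# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [52, 60, 61, 70, 77]

x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations

# Correlation as the sum of z-score products divided by n - 1
r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)) / (len(xs) - 1)

b1 = r * s_y / s_x        # slope, in y-units per x-unit
b0 = y_bar - b1 * x_bar   # intercept (assumed standard formula)

print(round(b1, 6), round(b0, 6))  # 6.0 46.0
```

Here each additional unit of x is associated with an increase of about 6 units in the predicted y.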