Midterm Flashcards

Question

quantitative graphs

Answer 1

histogram, stem-and-leaf plot, dot plot, box plot

Answer 2

prob of margin total over grand total. int: the proportion of all who are _________.

Answer 3

prob of inner over total. int: the proportion of all who are ________ and ________.

Answer 4

prob of inner over small total. int: the proportion of _______s who are _________.
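The three probabilities above (margin over total, inner over total, inner over small total) can be sketched on a two-way table. The counts and category names here are made up purely for illustration.

```python
# Hypothetical two-way table (made-up counts): class year by pet ownership.
table = {
    "junior": {"pet": 30, "no pet": 20},
    "senior": {"pet": 10, "no pet": 40},
}

# Grand total of everyone in the table.
total = sum(sum(row.values()) for row in table.values())

# Margin total for one row (the "small total" for juniors).
juniors = sum(table["junior"].values())

# Margin over grand total: proportion of all who are juniors.
p_junior = juniors / total

# Inner cell over grand total: proportion of all who are juniors and own a pet.
p_junior_and_pet = table["junior"]["pet"] / total

# Inner cell over small total: proportion of juniors who own a pet.
p_pet_given_junior = table["junior"]["pet"] / juniors

print(p_junior, p_junior_and_pet, p_pet_given_junior)
```

Only the denominator changes between the second and third calculations, which is the whole difference between "of all" and "of juniors".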

Answer 5

Shape, Outliers, Center, Spread + context

Answer 6

yes
- median
- IQR
no
- mean
- range
- standard deviation
- variance

Answer 7

ignores all values in the data set except the max and the min; not resistant to outliers

Answer 8

how far, on average, the values of the distribution are from the mean
- always greater than or equal to 0 (s = 0 => no variability)
- larger value = more variability
- not resistant to outliers
- measures variation about the mean
int: the (context) differs by (s.d. unit) from the mean of (mean), on average.
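A minimal sketch of these standard deviation properties, using a made-up data set and Python's standard `statistics` module. It assumes the sample standard deviation (dividing by n - 1) is the one intended.

```python
import statistics

# Made-up data set for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)
s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)

# The same value computed by hand from the definition.
n = len(data)
s_by_hand = (sum((x - mean) ** 2 for x in data) / (n - 1)) ** 0.5

# Not resistant: replacing one value with an extreme one inflates s.
with_outlier = data[:-1] + [90]
s_outlier = statistics.stdev(with_outlier)

# No variability at all gives s = 0.
s_constant = statistics.stdev([3, 3, 3, 3])

print(round(s, 3), round(s_outlier, 3), s_constant)
```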

Answer 9

+ -- shows how spread out the data are
- -- can't see peaks/gaps in the data

Answer 10

symmetric (no outliers)
- center: mean
- spread: standard deviation
skewed (yes outliers)
- center: median
- spread: IQR

Answer 11

linear transformation
--> if add/subtract
- mean = add/subtract
- standard deviation = same
- shape = same
--> if multiply/divide
- mean = multiply/divide
- standard deviation = multiply/divide
- shape = same
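A quick numeric check of the linear transformation rules, on made-up data. The shift of 10 and the scale factor of 3 are arbitrary choices for illustration.

```python
import statistics

# Made-up data set for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

shifted = [x + 10 for x in data]  # add a constant
scaled = [x * 3 for x in data]    # multiply by a positive constant

mean, sd = statistics.mean(data), statistics.stdev(data)

# Adding 10 moves the mean by 10 but leaves the standard deviation alone.
print(statistics.mean(shifted) - mean, statistics.stdev(shifted) - sd)

# Multiplying by 3 multiplies both the mean and the standard deviation by 3.
print(statistics.mean(scaled) / mean, statistics.stdev(scaled) / sd)
```

If the multiplier were negative, the standard deviation would be multiplied by its absolute value, since a spread cannot be negative.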

Answer 12

a curve that
- is always on or above the horizontal axis
- has area exactly 1 underneath it
- describes the overall pattern of a distribution

Answer 13

1. symmetric and bell shaped
2. the mean = median, both located at the exact center

Answer 14

- 68-95-99.7 rule
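The 68-95-99.7 rule can be checked against the standard normal curve. This sketch uses `statistics.NormalDist` from the Python standard library; the dictionary name `within` is just an illustrative choice.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)

# Area under the curve within k standard deviations of the mean.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} sd: {area:.4f}")
```

The three areas come out to roughly 0.68, 0.95, and 0.997, which is where the rule gets its name.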

Answer 15

Direction, Unusual features, Form, Strength + context

Answer 16

measures strength and direction of a linear association
- indicates direction by sign
- both variables need to be quantitative
- does not rely on units of measure
- has no unit of measurement
- does not measure form
- only for linear relationships
- not a resistant measure of strength
int: the linear association between (x-context) and (y-context) is (strength) and (direction)
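A minimal sketch of computing the correlation r from its definition, on made-up data. The helper name `correlation` and the unit conversion are illustrative assumptions; the point is that changing units leaves r unchanged.

```python
import math

def correlation(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)

# Changing the units of x (say inches to centimeters) does not change r.
r_new_units = correlation([x * 2.54 for x in xs], ys)

# Swapping x and y does not change r either.
r_swapped = correlation(ys, xs)

print(round(r, 4), round(r_new_units, 4), round(r_swapped, 4))
```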

Answer 17

using the regression line to predict for x-values outside the range of x-values that were used to calculate the line

Answer 18

actual y - predicted y
+ = underprediction
- = overprediction
int: the actual (y-context) was (residual) (above/below) the predicted value when (x-context = #)

Answer 19

the line that makes the sum of the squared residuals as small as possible
- distinction between x and y is essential
- r and slope have the same sign
- not resistant to unusual points
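A small sketch of the least-squares line and its residuals, on made-up data. The helper names `lsrl` and `sse` are hypothetical; the formulas used are the standard slope = Sxy / Sxx and intercept = mean of y minus slope times mean of x.

```python
def lsrl(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sse(xs, ys, intercept, slope):
    """Sum of squared residuals for the line y = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = lsrl(xs, ys)

# Residual = actual y - predicted y (negative means the line overpredicted).
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

print(a, b)
print([round(res, 2) for res in residuals])

# Any other line, such as one with a slightly steeper slope, has a larger SSE.
print(sse(xs, ys, a, b) < sse(xs, ys, a, b + 0.1))
```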

Answer 20

a scatterplot that shows the residuals as the y-values and the explanatory variable as the x-values. If it has no pattern, the linear model is appropriate.

Answer 21

measures the typical distance between the actual y and predicted y int: the actual (y-context) is typically about (s) away from the value predicted by the LSRL.

Answer 22

measures the percent reduction in the sum of squared residuals when using the LSRL to make predictions, instead of just using the mean of the y-values. int: about (r^2 %) of the variation in (y-context) can be explained by the linear relationship with (x-context)
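A sketch of both regression summaries, s and r^2, on made-up data. It assumes the usual definitions: s = sqrt(SSE / (n - 2)) and r^2 = 1 - SSE / SST, where SST is the sum of squared deviations of y from its mean.

```python
import math

# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

# Fit the least-squares line.
mean_x, mean_y = sum(xs) / n, sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Squared prediction errors using the line versus using only the mean of y.
sse_line = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
sst_mean = sum((y - mean_y) ** 2 for y in ys)

# s: typical distance between actual y and predicted y.
s = math.sqrt(sse_line / (n - 2))

# r^2: proportional reduction in squared error from using the line.
r_squared = 1 - sse_line / sst_mean

print(round(s, 3), round(r_squared, 3))
```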

Answer 23

outlier - pt that does not follow the pattern of the data (has a large residual)
high leverage - pt with a much larger/smaller x-value than the other pts
influential point - any pt that, if removed, substantially changes {slope, y-int, r, r^2, s}

Answer 24

int: if the random process of (context) is repeated many, many times, the average number of (x-context) we can expect is (expected value).

Answer 25

int: the (context) typically varies by (standard deviation) from the mean of (mean)
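The expected value and standard deviation being interpreted in the two cards above can be sketched for a discrete random variable. The probability distribution here is made up for illustration.

```python
import math

# Hypothetical probability distribution: value -> probability.
dist = {0: 0.2, 1: 0.5, 2: 0.3}

# Expected value: each value weighted by its probability.
expected = sum(x * p for x, p in dist.items())

# Variance: squared distance from the mean, weighted by probability.
variance = sum((x - expected) ** 2 * p for x, p in dist.items())

# Standard deviation: typical distance of the variable from its mean.
sd = math.sqrt(variance)

print(round(expected, 2), round(sd, 2))
```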

Answer 26

mean = mu x +/- mu y
standard deviation = sqrt (sigma x^2 + sigma y^2) (variances add, even when subtracting)
--> X and Y must be independent random variables
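A check of the rules for combining independent random variables, by listing every outcome of two made-up distributions. The helper names `mean_sd` and `combine` are hypothetical. For both the sum and the difference, the variances add, so the standard deviation is the square root of the sum of the squared standard deviations.

```python
import math
from itertools import product

def mean_sd(dist):
    """Mean and standard deviation of a {value: probability} distribution."""
    mu = sum(v * p for v, p in dist.items())
    var = sum((v - mu) ** 2 * p for v, p in dist.items())
    return mu, math.sqrt(var)

def combine(dist_x, dist_y, op):
    """Distribution of op(X, Y), assuming X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        value = op(x, y)
        # Independence: joint probability is the product of the two.
        out[value] = out.get(value, 0) + px * py
    return out

# Hypothetical distributions for illustration.
X = {0: 0.2, 1: 0.5, 2: 0.3}
Y = {1: 0.5, 3: 0.5}

mu_x, sd_x = mean_sd(X)
mu_y, sd_y = mean_sd(Y)

mu_sum, sd_sum = mean_sd(combine(X, Y, lambda x, y: x + y))
mu_diff, sd_diff = mean_sd(combine(X, Y, lambda x, y: x - y))

print(round(mu_sum, 2), round(mu_diff, 2))
print(round(sd_sum, 4), round(sd_diff, 4), round(math.sqrt(sd_x**2 + sd_y**2), 4))
```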

(50 cards)