Summa Week 7 Flashcards Preview

Flashcards in Summa Week 7 Deck (88)
1
Q

What is regression?

A

a way of predicting the value of one variable from another

2
Q

Regression is a _____ model of the relationship between ____ variables

A

hypothetical

two

3
Q

The regression model is a ____ one.

Linear or curvilinear?

A

linear

4
Q

We describe the relationship of a regression using the equation of a ________ _____

A

straight line

5
Q

_______ association can be summarized with a line of best fit

A

bivariate

6
Q

Bivariate association can be summarized with a ______________________

A

line of best fit

7
Q

The _____________ has the least amount of error of any possible regression line

A

the line of best fit

8
Q

What do we also call the “line of best fit”?

A

the regression line

9
Q

What else do we also call the “line of best fit”?

A

the prediction line

10
Q

What is the formula for a best fit line?

A

Yi = b0 + b1Xi + ei
or, in population notation,
Yi = β0 + β1Xi + εi
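
Here is a minimal sketch (mine, not the deck’s) of estimating this equation by least squares in Python with NumPy; the data are made up:

    import numpy as np

    # Hypothetical data: predictor X, outcome Y
    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

    # Least-squares estimates of the slope (b1) and intercept (b0)
    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()

    Y_pred = b0 + b1 * X   # predicted values (Y')
    e = Y - Y_pred         # the error term (ei)
    print(b0, b1)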

11
Q

What is b1 in regression?

A

the regression coefficient for the predictor

12
Q

what is the predictor?

A

the variable on the horizontal axis of a scatterplot used to find a regression line

13
Q

what is another name of the gradient of the regression line?

A

slope

14
Q

what is another name of the slope of the regression line?

A

gradient

15
Q

What is the slope symbolized by?

A

b1

16
Q

What does b1 suggest regarding the relationship of a regression line?

A

the direction and/or strength of the relationship

17
Q

What does b0 mean in a regression line?

A

the intercept (value of Y when X = 0)

18
Q

In a regression line, b0 gives the value of Y when X = ?

A

0

19
Q

What else is b0?

A

the point at which the regression line crosses the Y-axis

20
Q

What is another name of the point at which the regression line crosses the Y-axis?

A

the ordinate

21
Q

When the regression line is properly fitted, the error sum of squares is ____ than that which would obtain with any other straight line.

A

smaller

22
Q

When the regression line is properly fitted, the error sum of squares is smaller than that which would obtain with any other straight line. What is this describing?

A

the least squares criterion for determining the line of best fit/regression

23
Q

What is the least squares approach?

A

the least squares line has a sum of errors (SE) and a sum of squared errors (SSE) that are the smallest of all straight-line models
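
A quick numerical check of the criterion (my sketch, same made-up data as above): the least-squares line’s SSE beats any other straight line’s.

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

    def sse(b0, b1):
        # Sum of squared errors for a candidate line Y' = b0 + b1*X
        return np.sum((Y - (b0 + b1 * X)) ** 2)

    # Least-squares estimates
    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()

    print(sse(b0, b1))        # the smallest achievable SSE
    print(sse(b0 + 0.5, b1))  # any other line gives a larger SSE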

24
Q

What does SE signify?

A

sum of errors in a least squares line

25
Q

What does SSE refer to?

A

the sum of squared errors in the least squares approach

26
Q

How good is the least squares line model?

A

only as good as the data given

27
Q

do we need to test how well the least squares model fits the observed data in a regression?

A

hell yeah

28
Q

What is another way of understanding regression (and by that token, ANOVA)?

A

total variation = explained variation + unexplained variation

29
Q

What is the formula for regression?

A

Σ(Y − Ȳ)² = Σ(Y′ − Ȳ)² + Σ(Y − Y′)²

(where Ȳ is the mean of Y and Y′ is the predicted value)

30
Q

What is the sum of squares?

A

the proportion of variance accounted for by the regression model

31
Q

the proportion of variance accounted for by the regression model

A

the sum of squares

32
Q

What is a symbol for the sum of squares?

A

r^2

33
Q

What is r^2?

A

the Pearson Correlation Coefficient Squared

34
Q

What is the formula for the Pearson Correlation Coefficient Squared / proportion of variance accounted for by the regression model / r^2?

A

r^2 = Σ(Y′ − Ȳ)² / Σ(Y − Ȳ)²

= Explained Variation / Total Variation
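
To make the partition and r^2 concrete, here is a sketch (same made-up data) verifying that total = explained + unexplained and computing r^2:

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    Y_pred = b0 + b1 * X

    ss_total = np.sum((Y - Y.mean()) ** 2)       # Σ(Y − Ȳ)²: total variation
    ss_model = np.sum((Y_pred - Y.mean()) ** 2)  # Σ(Y′ − Ȳ)²: explained variation
    ss_error = np.sum((Y - Y_pred) ** 2)         # Σ(Y − Y′)²: unexplained variation

    print(np.isclose(ss_total, ss_model + ss_error))  # True: the partition holds
    print(ss_model / ss_total)                        # r²: proportion of variance explained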

35
Q

A regression allows you to predict Y values given a set of X values; however, it does not allow you to attribute causality to the relationship. T or F?

A

True

36
Q

The variability in Y is caused by X. Is this T or F for Pearson’s correlation coefficient squared?

A

It’s false!

The variability can be accounted for by the variability in X, but NOT necessarily caused by X

37
Q

A regression allows you to predict __ values given a set of __ values, however it does not allow you to attribute _________ to the relationship

A

Y
X
causality

38
Q

What are two methods of identifying extreme outliers?

A
- using a boxplot

- using z-scores

39
Q

How do you find extreme outliers in a boxplot?

A

SPSS: Graphs - Legacy Dialogs - Boxplot

then right-click on an individual * (extreme outlier) and select “Clear”

40
Q

How do you identify extreme outliers using z-scores?

A

SPSS: Analyze - Descriptives - Descriptives, and select the “Save standardized values as variables” option.
Eliminate cases with z-scores more than ±3 SD from the mean.
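
A rough Python stand-in for that SPSS step (my sketch; the data are simulated):

    import numpy as np

    rng = np.random.default_rng(0)
    scores = np.append(rng.normal(5.0, 1.0, 49), 25.0)  # 49 typical cases plus one extreme outlier

    z = (scores - scores.mean()) / scores.std(ddof=1)   # standardized values (z-scores)
    cleaned = scores[np.abs(z) <= 3]                    # keep cases within ±3 SD of the mean
    print(len(scores) - len(cleaned))                   # number of cases eliminated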

41
Q

What do ±3 z-scores refer to?

A

extreme outliers more than 3 SD away from the mean

42
Q

Why are extreme outliers important in regression?

A

they could pull the overall results of the study away from the estimated population parameters

43
Q

What are residuals?

A

the differences between the values of the outcome predicted by the model and the values of the outcome observed in the sample (large residuals point to extreme outliers)

44
Q

What is another term for residuals?

A

influential cases, or extreme outliers

45
Q

Influential cases are what?

A

those with an absolute value of standardized residuals greater than 3

46
Q

What are standardized residuals?

A

those that are divided by an ESTIMATE of their standard deviation
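
Putting cards 45-46 together in a sketch (made-up data; the residual SD is estimated with df = N − 2):

    import numpy as np

    X = np.arange(1.0, 21.0)  # 20 hypothetical cases
    Y = 2.0 * X + 1.0         # an outcome that sits exactly on a line...
    Y[9] = Y[9] + 15.0        # ...except case 10, pushed well off it

    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    resid = Y - (b0 + b1 * X)

    # Divide residuals by an ESTIMATE of their standard deviation
    std_resid = resid / np.sqrt(np.sum(resid ** 2) / (len(X) - 2))
    print(np.where(np.abs(std_resid) > 3)[0])  # flags case index 9 as influential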

47
Q

What in SPSS identifies influential cases in a linear regression?

A

SPSS Casewise diagnostics

48
Q

Other methods of identifying influential cases in SPSS include:

A

the areas under Distances and Influence Statistics in the Linear Regression dialog of SPSS

49
Q

Can we safely assume linearity in regression analysis?

A

hell naw. Who can say?

50
Q

Do we assume errors are independent in a regression analysis?

A

nahhhhhhh. there could be a third or fourth variable

51
Q

Do we assume errors are normally distributed?

A

Yeah, as long as the sample size is large enough

52
Q

Do we assume homoscedasticity in regression analysis?

A

the residuals at each level of the predictor should have the same variance, but it’s not as big of a deal if violated

53
Q

What is homogeneity of variance in arrays?

A

the variance of Y for each value of X is constant in the population

54
Q

What is normality in arrays?

A

in the population, the values of Y corresponding to any specified value of X are normally distributed around the predicted Y

55
Q

spooled^2 formula?

A

spooled^2 = (df1/dftotal)(s1^2) + (df2/dftotal)(s2^2)
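
A worked example of the formula in Python (numbers invented):

    # Hypothetical group variances and sizes
    s1_sq, n1 = 4.0, 12
    s2_sq, n2 = 6.0, 10

    df1, df2 = n1 - 1, n2 - 1
    df_total = df1 + df2
    s_pooled_sq = (df1 / df_total) * s1_sq + (df2 / df_total) * s2_sq
    print(s_pooled_sq)  # weighted average of the two variances: 4.9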

56
Q

What are variable types for regression analysis?

A

the predictor variable must be quantitative or categorical, and the outcome variable must be quantitative, continuous and unbounded

57
Q

What is non-zero variance?

A

the predictor should have some variation in value

58
Q

What are predictors that are uncorrelated with “external variables”?

A

external variables are variables that haven’t been included in the regression model which influence the outcome variable

59
Q

What is the minimum sample size for regression analysis?

A

10 or 15 cases per predictor variable

60
Q

How do you visually inspect the linearity through the scatterplot of the predictor and the outcome variable?

A

SPSS: Graphs - Legacy Dialogs - Scatter/Dot - Simple Scatter; the x-axis is the predictor, the y-axis is the outcome variable. Add the “line of best fit” to assist in checking linearity. If the scatterplot follows a linear pattern (versus a curvilinear pattern), then the assumption is met

61
Q

What shape does the scatterplot of a regression analysis need to be for the assumption of linearity to be met?

A

it needs to be a line, rather than a curvilinear pattern

62
Q

How do you check the assumption of independent errors?

A

using the Durbin-Watson test for serial correlations between errors

63
Q

how can the Durbin-Watson test vary?

A

the test statistic can vary between 0 and 4, with a value of 2 meaning that the residuals are uncorrelated

64
Q

Values less than __ or ___ than ___ are definitely cause for concern; however, values closer to 2 may still be problematic depending on your sample and model

A

less than 1 or greater than 3
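
The statistic itself is easy to compute by hand; a sketch (my own, on made-up residuals) of DW = Σ(e_t − e_{t−1})² / Σ e_t²:

    import numpy as np

    # Hypothetical residuals, in case order
    e = np.array([0.5, 0.1, -0.4, 0.3, -0.2, -0.5, 0.6, -0.1])

    dw = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
    print(dw)  # ~2.5 here: between 1 and 3, so no strong serial correlation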

65
Q

Normally distributed errors are checked in a regression analysis by…

A

- visually inspect normality through the Q-Q plot of the residuals

- statistically inspect normality: conduct z-tests on the skew and kurtosis of the residuals

66
Q

How do you visually inspect a scatterplot of the standardized residuals?

A

plot the standardized residuals, ZRESID, against the standardized predicted values, ZPRED

67
Q

The standardized residuals ____ vary as a function of the standardized predicted values: the trend is centered around zero but also that the variance around _____ is _____ uniformly and randomly

A

MUST NOT; zero; scattered

68
Q

What is the difference between homoscedasticity and heteroscedasticity?

A

a sequence or a vector of random variables is homoscedastic /ˌhoʊmoʊskəˈdæstɪk/ if all random variables in the sequence or vector have the same finite variance. This is also known as homogeneity of variance. The complementary notion is called heteroscedasticity

69
Q

How do you get a regression analysis in SPSS?

A

Analyze - Regression - Linear; enter the outcome as the DV and the predictor as the IV; OK
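
Outside SPSS, a comparable run can be sketched with Python’s statsmodels (assuming it is installed; the data are made up):

    import numpy as np
    import statsmodels.api as sm

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
    Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 11.7, 14.2, 16.1])

    model = sm.OLS(Y, sm.add_constant(X)).fit()  # outcome (DV) first, predictor (IV) plus constant
    print(model.summary())  # coefficient table, R-squared, F test, Durbin-Watson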

70
Q

Which output columns give the values for the line of best fit?

A

y-intercept (b0) = the Unstandardized Coefficients B value in the (Constant) row

71
Q

y = b0 + b1 X

A

b1 = the Unstandardized Coefficients B value in the second row (the IV)

72
Q

What is the B or bivariate correlation in regression analysis?

A

found under Standardized Coefficients (Beta), in the IV row

73
Q

What is B?

A

the standardized coefficient for the predictor variable, or the percentage associated with

74
Q

To test whether a sample b is different from the hypothesized b* (b* = 0), use df = N − 2 and the formula…

A

t = (b − b*) / sb

75
Q

If H0 is rejected it means that in the population the ______ ______ is significantly different from zero

A

regression slope

76
Q

It can be shown that b is normally distributed about b* with a standard error approximated by the formula

A

sb = sY·X / [sX · √(N − 1)]

77
Q

CI(b*) =

A

b ± (tα/2) · sY·X / [sX · √(N − 1)], with df = N − 2
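
Cards 74-77 combined in one sketch (made-up data; SciPy assumed available for the critical t):

    import numpy as np
    from scipy import stats

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
    Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 11.7, 14.2, 16.1])
    N = len(X)

    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    resid = Y - (b0 + b1 * X)

    s_yx = np.sqrt(np.sum(resid ** 2) / (N - 2))   # standard error of estimate
    s_b = s_yx / (X.std(ddof=1) * np.sqrt(N - 1))  # standard error of b

    t = (b1 - 0) / s_b                             # test H0: b* = 0, df = N - 2
    t_crit = stats.t.ppf(1 - 0.05 / 2, df=N - 2)   # two-tailed, alpha = .05
    ci = (b1 - t_crit * s_b, b1 + t_crit * s_b)    # 95% CI for the slope
    print(t, ci)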

78
Q

How do you find the adjusted R ^ 2?

A

under the Model Summary, Adjusted R Square; e.g.,
Adjusted R^2 = .33, F(1, 198) = 99.59, p < .001
(N = 200)

79
Q

What does adjusted R^2 refer to?

A

approximately (adjusted R^2) of the variance of the DV was accounted for by its linear relationship with the IV

80
Q

SSt

A

total variability between the scores and the mean (how the individual scores vary from the sample mean)

81
Q

SSr

A

residual/error variability between the regression model and the actual data (how the individual scores vary from the regression line)

82
Q

SSm

A

model variability between the model and the mean (how the mean value of Y differs from the regression line)

83
Q

What is the purpose of the sums of squares?

A

SS uses the differences between the observed data and the mean value of Y

84
Q

If the model results in better prediction than using the mean, then we expect SSm to be much ______ than SSr

A

greater!

85
Q

Mean squared error is…

A

the sums of squares, which are total values, expressed as averages

86
Q

mean squared error can be expressed as…

A

averages

87
Q

Mean squared errors are called

A

mean squares, MS

88
Q

What is the formula for F-stat for regression analysis?

A

MSm / MSr
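
A closing sketch (same made-up data as the earlier examples) that builds MSm, MSr, and F:

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
    Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 11.7, 14.2, 16.1])
    N = len(X)

    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    Y_pred = b0 + b1 * X

    ss_m = np.sum((Y_pred - Y.mean()) ** 2)  # model sum of squares
    ss_r = np.sum((Y - Y_pred) ** 2)         # residual sum of squares

    ms_m = ss_m / 1        # df(model) = number of predictors, 1 here
    ms_r = ss_r / (N - 2)  # df(residual) = N - 2
    print(ms_m / ms_r)     # the F statistic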