Exam 2- Regression Flashcards Preview

Flashcards in Exam 2- Regression Deck (64):
0

Statistical model

An equation that fits the pattern between a response variable and possible explanatory variables, accounting for deviations from the model. In this deck, that equation is the regression line.

1

Y-hat = a + bx

The hat on Y reminds us that there are deviations about the line and that the values of y specified by the line are PREDICTIONS
a - intercept
b - slope
Y-hat - predicted value of y for a given x

2

What does the y-intercept tell us?

The value of y when x=0

3

What does slope tell us?

The change in y for every one-unit increase in x, on average!

4

As x increases by one unit, what happens to y when the slope is negative?

Y decreases

5

As x increases by one unit what happens to y when slope is positive?

Y increases by b (rise/run) units, on average

6

b=

Rise(y)/run(x)

7

Interpretation of slope : rise/run

For every one-inch increase in height at age 4, height at age 18 increases by 1.15 inches ON AVERAGE

8

Interpretation of y- intercept

Males who are zero inches tall at age 4 are predicted to be 23 inches tall at age 18

The intercept is the value of y when x=0
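
The two interpretations above can be combined into one prediction equation. The sketch below assumes the intercept (23) and slope (1.15) quoted in these cards; the function name and the 40-inch input are made up for illustration.

```python
# Sketch of the prediction equation y-hat = a + b*x, using the values
# quoted in the cards above (intercept 23, slope 1.15).
INTERCEPT_IN = 23.0  # a: predicted height at 18 when height at 4 is 0 inches
SLOPE = 1.15         # b: extra inches at 18 per extra inch at 4, on average


def predict_height_at_18(height_at_4_in):
    """Return the predicted height at age 18, in inches."""
    return INTERCEPT_IN + SLOPE * height_at_4_in


# A hypothetical 40-inch four-year-old (not a value from the deck).
print(round(predict_height_at_18(40.0), 2))

# One extra inch at age 4 adds the slope to the prediction.
print(round(predict_height_at_18(41.0) - predict_height_at_18(40.0), 2))
```

The intercept is only an anchor for the line: nobody is zero inches tall at age 4, so x=0 lies far outside the data.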

9

How to predict

- collect data
- plot data
- fit the data with a straight line equation
- evaluate the equation
- predict

10

Residuals

Vertical distance between the observed y value and the line, or

The difference between the observed y value and y-hat, the value predicted by the regression line

11

Squared prediction error: (residual)^2

(Observed y - predicted y)^2 = (y - y-hat)^2

Residuals are squared because the plain sum of the residuals about the least-squares line is zero (the positive residuals above the line cancel the negative residuals below it)
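
A small numeric sketch shows why squaring is needed. The data and the fitted line y-hat = 2.2 + 0.6x below are hypothetical, chosen only so the arithmetic is easy to follow.

```python
# Hypothetical data and a hypothetical fitted line y-hat = 2.2 + 0.6*x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6

predicted = [a + b * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]  # observed - predicted
squared_errors = [e ** 2 for e in residuals]

for x, y, e in zip(xs, ys, residuals):
    side = "above" if e > 0 else "below"
    print(f"x={x}: observed y={y}, residual={e:+.2f} ({side} the line)")

# The signed residuals cancel; the squared residuals do not.
print("sum of residuals:", round(sum(residuals), 10))
print("sum of squared residuals:", round(sum(squared_errors), 10))
```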

12

Positive residuals

Points above the line

13

Negative residuals

Points below the line

14

The least-squares regression line is

The line with the smallest sum of squared errors (denoted SSE)
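
As a rough check of this definition, the sketch below computes SSE for the least-squares line and for a few other candidate lines on hypothetical data. The slope formula used here (sum of cross-deviations over sum of squared x-deviations) is a standard closed form, algebraically equivalent to b = r(Sy/Sx); the candidate lines are arbitrary.

```python
# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def sse(a, b):
    """Sum of squared errors for the line y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b_ls = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a_ls = y_bar - b_ls * x_bar

# Arbitrary competitors: nudge the slope, nudge the intercept, or use a flat line.
candidates = {
    "least squares": (a_ls, b_ls),
    "steeper slope": (a_ls, b_ls + 0.1),
    "lower intercept": (a_ls - 0.2, b_ls),
    "flat line at y-bar": (y_bar, 0.0),
}
for name, (a, b) in candidates.items():
    print(f"{name:20s} SSE = {sse(a, b):.2f}")
```

Every competitor should report a larger SSE than the least-squares line.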

15

The sum of squared residuals (errors), SSE, represents

The variation in the observed values of y about the regression line (the unexplained variation)
SSE = sum of (residuals)^2 = sum of (y - y-hat)^2

16

Least - squares equation

Y-hat=a +bx

17

Formula for a (intercept)

a = y-bar - b(x-bar)
Where y-bar and x-bar are the respective means

18

Formula for b(slope)

Slope is a rate of change: the amount of change in y-hat when x increases by 1

b = r(Sy/Sx)
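
The two formulas can be applied in sequence: first the slope from r, Sy and Sx, then the intercept from the means. This is a minimal sketch on hypothetical data, using sample standard deviations from Python's `statistics` module.

```python
from statistics import mean, stdev

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (divide by n - 1)

# Correlation r: average product of standardized deviations.
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)

b = r * s_y / s_x      # slope: b = r(Sy/Sx)
a = y_bar - b * x_bar  # intercept: a = y-bar - b(x-bar)

print(f"r = {r:.4f}, b = {b:.4f}, a = {a:.4f}")
```

For this data the slope works out to about 0.6 and the intercept to about 2.2.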

19

Least-squares regression line facts

- makes the distances of the data points from the line small only in the y direction
- if we reverse the roles of the two variables, we get a different least-squares regression line
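
The second fact is easy to check numerically. The sketch below fits y on x and then x on y for the same hypothetical data; `least_squares` is an illustrative helper, not a library function.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line predicting ys from xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar - slope * x_bar, slope


# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a_yx, b_yx = least_squares(xs, ys)  # y on x: minimizes vertical distances
a_xy, b_xy = least_squares(ys, xs)  # x on y: minimizes horizontal distances

print(f"y on x: y-hat = {a_yx:.2f} + {b_yx:.2f}x")
print(f"x on y: x-hat = {a_xy:.2f} + {b_xy:.2f}y")

# Drawn on the same axes, the x-on-y line has slope 1/b_xy, not b_yx.
print(f"slopes in the x-y plane: {b_yx:.2f} versus {1 / b_xy:.2f}")
```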

20

What is the connection between the correlation r and the slope b of the least-squares line?

Slope and r have the same sign
b=r only when Sy=Sx
Both r and b tell us the direction
If r=0, b=0
If r<0, b<0
If r>0, b>0
If we know the sign of r we know the sign of b, and vice versa

21

What b and r have in common

They always have the same sign

A change of 1 standard deviation in x corresponds to a change of r standard deviations in y-hat.

Measured in standard deviations, the change in y-hat is never more than the change in x, because r is between -1 and 1

22

The least squares regression line always passes

Through the point (x-bar, y-bar)
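
This follows directly from the intercept formula a = y-bar - b(x-bar) given earlier in the deck. Substituting x = x-bar into the line:

```latex
\hat{y}(\bar{x}) = a + b\bar{x}
                 = (\bar{y} - b\bar{x}) + b\bar{x}
                 = \bar{y}
```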

23

Correlation r describes

The direction and strength of the straight-line relationship between x and y

24

The square of the correlation, r^2, gives us

The percentage (%) of the variation in the values of y that is explained by the least-squares regression line

On the chart R-sq=0.6937 or 69.37%

25

Regression line

Is a straight line that describes how a response variable y changes as an explanatory variable x changes

26

Least squares line is a math model used to predict

The value of y for a given x

Y-hat = a + bx

27

Least squares regression line requires that we have

An explanatory variable and a response variable, both quantitative

28

The least squares regression line of y on x is the line that makes

The sum of the squares of the vertical distances of the data points from the line as small as possible

29

The least-squares regression line, like any line, has

A slope and an intercept
A change of y into y-hat (the line gives predicted values)

Slope b = r(Sy/Sx), where r is the correlation coefficient and Sx and Sy are the standard deviations of x and y

30

When r^2 is close to 0, the regression line

Is not a good model for the data; the scatterplot has a hamburger shape, and almost none of the variation in y is explained by the regression line

31

When r^2 is close to 1

The regression line fits the data well: almost 100% of the variation in y is explained by x

32

The coefficient of determination, r^2

represents the fraction (%) of the variation in the values of y that is explained by the least squares regression of y on x.

33

Regression is a common statistical setting, and least-squares regression is the most common method for

Fitting a regression line to data

34

Least squares regression line always passes through

The point (x-bar, y-bar)

35

Residual

Difference between an observed value of the response variable y and the value predicted by the regression line, y-hat
Residual = observed y - predicted y = y - y-hat

36

The residuals show

How far the data are from the regression line and how well the line describes the data.

37

The mean of the least-squares residuals is

Always zero!
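
A short derivation, assuming only the intercept formula a = y-bar - b(x-bar) from earlier in the deck, shows why the residuals sum to zero (and so have mean zero):

```latex
\sum_{i=1}^{n} (y_i - \hat{y}_i)
  = \sum_{i=1}^{n} (y_i - a - b x_i)
  = n\bar{y} - n a - n b \bar{x}
  = n\bar{y} - n(\bar{y} - b\bar{x}) - n b \bar{x}
  = 0
```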

38

A residual plot (diagnostic plot)

Is a scatterplot of the residuals versus the observed x values (or versus the y-hats, which lie on the regression line)

39

If the residual plot shows uniform scatter of the points about the fitted line

Above and below with no unusual observations or systematic pattern, then the regression line captures the overall relationship well

40

Residual plot - curved pattern

The relationship is not linear
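
To see where the curved pattern comes from, the sketch below fits a straight line to hypothetical data that actually follow y = x^2 and lists the residuals in order of x.

```python
# Hypothetical curved data: y = x squared.
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = y_bar - b * x_bar

residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
signs = ["+" if e > 0 else "-" for e in residuals]

# Positive at both ends and negative in the middle: a "smile" pattern.
print([round(e, 2) for e in residuals])
print(" ".join(signs))
```

The residuals still sum to zero, but their signs are not random, which is the signal that a straight line is the wrong model.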

41

Residual plot - megaphone

Increasing or decreasing spread about the line as x changes; when the spread increases with x, predictions of y will be LESS accurate for larger x's

42

Individual points with large residuals are

Outliers in the vertical direction

43

Influential observation

Is an outlier in either the x or y direction which, if removed, would markedly change the value of the slope and y-intercept
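
A quick way to check whether a point is influential is to fit the line with and without it. In this sketch both the data and the extra point (15, 2) are hypothetical, and `least_squares` is an illustrative helper.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line predicting ys from xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar - slope * x_bar, slope


# Hypothetical data plus one hypothetical point far out in the x direction.
base_x = [1, 2, 3, 4, 5]
base_y = [2, 4, 5, 4, 5]
extra_x, extra_y = 15, 2

a_without, b_without = least_squares(base_x, base_y)
a_with, b_with = least_squares(base_x + [extra_x], base_y + [extra_y])

print(f"without the point: y-hat = {a_without:.2f} + {b_without:.2f}x")
print(f"with the point:    y-hat = {a_with:.2f} + {b_with:.2f}x")
```

Here the single extra point is enough to flip the sign of the slope.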

44

Outlier

An observation that lies outside the overall pattern of the other observations

45

Ecological correlation

A correlation based on group averages (means) rather than on individuals.
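
The sketch below illustrates why such correlations overstate the relationship for individuals. The three groups are invented so that the group means fall exactly on a line while the individuals scatter around it.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation r between two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical (x, y) pairs for individuals in three groups.
groups = {
    "group A": [(1, 3), (3, 1), (2, 2)],
    "group B": [(3, 5), (5, 3), (4, 4)],
    "group C": [(5, 7), (7, 5), (6, 6)],
}

individuals = [pair for members in groups.values() for pair in members]
r_individuals = correlation([x for x, _ in individuals], [y for _, y in individuals])

mean_x = [sum(x for x, _ in m) / len(m) for m in groups.values()]
mean_y = [sum(y for _, y in m) / len(m) for m in groups.values()]
r_group_means = correlation(mean_x, mean_y)

print(f"r for individuals: {r_individuals:.2f}")
print(f"r for group means: {r_group_means:.2f}")
```

Averaging removes the within-group scatter, so the correlation of the means is much stronger than the correlation among individuals.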

46

Correlation measures

The direction and strength of the linear relationship between quantitative variables x and y

47

Regression models

The linear relationship between x and y; the model can be used to predict a value of the response variable y for a specific value of the explanatory variable x

48

What is total variation?

Sum of squared deviations about y-bar

49

What is unexplained variation?

Sum of squared residuals, or the variation not explained by the regression line
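
Total and unexplained variation can be computed side by side, and their ratio gives r^2 through the deck's formula R-sq = 1 - unexplained variation / total variation. A minimal sketch on hypothetical data:

```python
# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = y_bar - b * x_bar

# Total variation: squared deviations of y about y-bar.
total_variation = sum((y - y_bar) ** 2 for y in ys)
# Unexplained variation: squared residuals about the regression line.
unexplained_variation = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

r_squared = 1 - unexplained_variation / total_variation

print(f"total variation       = {total_variation:.2f}")
print(f"unexplained variation = {unexplained_variation:.2f}")
print(f"r^2                   = {r_squared:.2f}")
```

For this data the line explains about 60% of the variation in y.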

50

Regression assumptions:

The relationship between x and y can be modeled by a straight line (residuals show randomness around the line)
Variation in the y's about the line does not depend on the value of x (residuals are similar in size for all x's)

51

If the residual conditions (assumptions) are met

The residual plot looks like a shoe box: there is no pattern in the residuals

52

A smile or frown pattern in a residual plot indicates

Non-linear relationship - violation of conditions (assumptions)

53

Megaphone pattern in residual plot indicates

Non-constant variation (the variation in y depends on x)
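
One crude way to spot a megaphone without drawing the plot is to compare the typical size of the residuals for small and large x. The sketch below does this on hypothetical data whose scatter grows with x; it is an informal check, not a formal test of constant variance.

```python
# Hypothetical data: y = 2x plus errors whose size grows with x.
xs = list(range(1, 9))
errors = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6, 0.7, -0.8]
ys = [2 * x + e for x, e in zip(xs, errors)]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = y_bar - b * x_bar

residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Average absolute residual for the smaller and the larger x values.
half = n // 2
spread_small_x = sum(abs(e) for e in residuals[:half]) / half
spread_large_x = sum(abs(e) for e in residuals[half:]) / (n - half)

print(f"typical residual size, small x: {spread_small_x:.2f}")
print(f"typical residual size, large x: {spread_large_x:.2f}")
```

A clearly larger spread at one end of the x range is the numeric counterpart of the megaphone shape.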

54

Shoe box residual plot with a point outside indicates

Outlier in either x or y direction

55

An estimated statistical model

Regression equation

56

Regression equation is an

Estimated statistical model

57

r2 is a measure of how

Successfully the regression explains the variation in the response, y

58

The sum of squared residuals measures ...... variation

The unexplained

59

R-sq is a measure of the fraction of variation in y that is .....
R-sq = 1 - unexplained variation / total variation

Explained by x

60

Residual plots help us magnify the residuals and identify ..... Sometimes we can see ..... observations and ...... which are much more visible on the residual plot.

Problems.
Unusual observations
Patterns

61

A residual plot is a ..... of the residuals plotted against the x-values

Scatterplot

62

Correlations based on ..... rather than on ...... can be misleading if they are interpreted to be about individuals

Averages ..... individuals

63

Removing an influential point from the data set will change ...

Slope and y-intercept