Flashcards
(17 cards)
What is regression
A statistical measure that determines the strength of the relationship between a DV and a set of IVs
Regression has predictive power, unlike correlation - it allows us to forecast the value of, or the change in, the DV from changes in the IV or IVs
IVs are regarded as inputs to the process and can take any value freely - known as predictor variables (X)
DVs are values that change as a consequence of changes in the other values within the process - known as the outcome or response (Y)
What is linear regression
The simplest mathematical relationship between two variables, X and Y, is a linear relationship
What is simple linear regression
The relationship between X and Y is represented with a line
There is only 1 DV and 1 IV
The basis of regression is Pearson's r - without a linear relationship, regression cannot be run
The relationship is treated as causal, so the main intention is to predict
Note: causality is demonstrated through logic, not just through statistics
What does it mean if regression is a straight line
A linear relationship between 2 variables is a straight-line relationship
Each time X changes by one unit, there is a constant change in Y - this change is called the slope
When X is zero (0), the line intercepts the Y axis (crosses the vertical axis) and Y takes a constant value - this constant is called the Y-intercept
Explain Y = bX + a
Y = score on the Y variable
b = slope (change in Y / change in X), or Sum of Products (SP) / Sum of Squares of X (SSx)
X = the score on the X variable
a = the Y-intercept (a is the value of Y when X is 0)
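A minimal sketch of how b and a could be worked out by hand from raw scores. The five X and Y scores are made up for illustration, and the intercept formula a = mean of Y - b(mean of X) is assumed here (it follows from the line passing through both means).

```python
# Hypothetical scores, invented for this example only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# SP: sum of products of the deviations of X and Y from their means.
sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
# SSx: sum of squared deviations of X from its mean.
ss_x = sum((xi - mean_x) ** 2 for xi in x)

b = sp / ss_x            # slope
a = mean_y - b * mean_x  # Y-intercept (assumed formula, see note above)

print(f"Y = {b:.2f}X + {a:.2f}")
```

For these scores SP is 6 and SSx is 10, so the slope is 0.6 and the intercept is 2.2.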
How do we determine best fit line
The best fit line falls closest to all points in a scatter plot
Regression uses the best fit line, also called the regression line
It is the line that follows the least squares criterion - it minimizes the sum of the squared differences between every predicted Y and actual Y, i.e. it minimizes error
What does the regression line also predict
Predicts a value of Y (Y’) for each value of X
Each Y’ is in error in comparison with the Y score actually obtained
The difference (Y - Y’) is the error of prediction
The sum of the differences between actual Y and predicted Y is always zero
The differences need to be squared to get a usable value, which means summing the squared errors: Σ(Y - Y’)²
The line that minimizes the value of Σ(Y - Y’)² is the least squares regression line
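A rough sketch of why the errors are squared, using made-up scores. The slope 0.6 and intercept 2.2 are assumed to be the least squares values for this hypothetical data (SP/SSx = 6/10), and the two other lines are arbitrary alternatives chosen for comparison.

```python
# Hypothetical scores, invented for this example only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def sum_of_errors(b, a):
    """Sum of the raw errors of prediction for the line Y' = bX + a."""
    return sum(yi - (b * xi + a) for xi, yi in zip(x, y))

def sum_of_squared_errors(b, a):
    """Sum of the squared errors of prediction for the line Y' = bX + a."""
    return sum((yi - (b * xi + a)) ** 2 for xi, yi in zip(x, y))

# Raw errors cancel out, so they cannot be used to compare lines.
raw_total = sum_of_errors(0.6, 2.2)

# Squared errors do not cancel, so candidate lines can be compared.
best = sum_of_squared_errors(0.6, 2.2)     # assumed least squares line
flatter = sum_of_squared_errors(0.5, 2.5)  # arbitrary alternative
steeper = sum_of_squared_errors(0.7, 1.9)  # arbitrary alternative

print(round(raw_total, 2), round(best, 2), round(flatter, 2), round(steeper, 2))
```

In this example the raw errors add up to zero, while the squared errors are smaller for the least squares line than for either alternative.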
What do you call the space in between Predicted Y and actual Y
The differences are called residuals or errors
Explain the least squares regression line
Always passes through the means of X and Y
For extreme values of Y, the predicted Y values are closer to the mean of Y (Y’ values tend to move, or regress, toward the mean of Y)
Because we determine the value of Y’ from values of X, we now have a new equation for the line, called the regression line equation
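A small sketch of these two properties on made-up scores. The slope and intercept are assumed to be the least squares values for this hypothetical data, and comparing sums of squared deviations is just one way to show that predicted values sit closer to the mean of Y.

```python
# Hypothetical scores, invented for this example only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b, a = 0.6, 2.2  # assumed least squares slope and intercept for this data

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Property 1: plugging the mean of X into the line gives the mean of Y.
y_at_mean_x = b * mean_x + a

# Property 2: predicted values vary less around the mean of Y than actual values.
predicted = [b * xi + a for xi in x]
ss_actual = sum((yi - mean_y) ** 2 for yi in y)
ss_predicted = sum((yp - mean_y) ** 2 for yp in predicted)

print(round(y_at_mean_x, 2), mean_y)
print(round(ss_predicted, 2), "<=", ss_actual)
```

The lowest actual Y here is 2, but its predicted value is about 2.8, which is nearer to the mean of 4.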
Explain Y’ = bX + a
Regression line
Y’ = the predicted Y
b = the slope
X = any score on the X variable
a = the Y intercept
we will be able to predict Y’ scores given values of X
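A minimal sketch of using the regression line equation to predict. The function name predict_y and the values b = 0.6 and a = 2.2 are hypothetical, chosen only to show the arithmetic.

```python
def predict_y(x_score, b, a):
    """Return the predicted Y (Y') for one X score, using Y' = bX + a."""
    return b * x_score + a

b, a = 0.6, 2.2  # hypothetical slope and Y-intercept

for x_score in (0, 3, 10):
    print(x_score, round(predict_y(x_score, b, a), 2))
```

When X is 0 the prediction equals the Y-intercept, and each extra unit of X adds b to the prediction.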
What are requirements for simple linear regression
A straight line or linear relationship
Both X and Y must be measured at the interval level
The sample must be a random sample
Both X and Y variables must be normally distributed
sample size must be at least 30 for normality violations to be disregarded
What is accuracy in simple linear regression
Residual sum of squares
SSerror = Σ(Y - Y’)²
Squared differences of predicted Y (Y’) from actual Y
We want this to be a relatively small value
The larger the residual SS, the larger the error in prediction
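A short sketch of the residual sum of squares on made-up scores. The line Y' = 0.6X + 2.2 is assumed to be the least squares fit for this hypothetical data.

```python
# Hypothetical scores, invented for this example only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b, a = 0.6, 2.2  # assumed least squares slope and intercept for this data

predicted = [b * xi + a for xi in x]                   # Y' for each X
residuals = [yi - yp for yi, yp in zip(y, predicted)]  # Y - Y'

# Residual sum of squares: square each residual, then add them up.
ss_error = sum(e ** 2 for e in residuals)

print(round(ss_error, 2))
```

Here the squared residuals are roughly 0.64, 0.36, 1.00, 0.36 and 0.04, which add up to about 2.4.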
What is the accuracy in simple linear regression
Regression sum of squares
- SSreg = Σ(Y’ - mean of Y)²
- Squared differences of predicted Y (Y’) from the mean of Y
- We want this to be a relatively large value
- the larger the regression SS, the more different the regression line is from the mean of Y
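A short sketch of the regression sum of squares on made-up scores, again assuming Y' = 0.6X + 2.2 is the least squares line for this hypothetical data. It also checks a standard identity not stated on the card: for a least squares line, the total sum of squares of Y splits into the regression part plus the residual part.

```python
# Hypothetical scores, invented for this example only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b, a = 0.6, 2.2  # assumed least squares slope and intercept for this data

mean_y = sum(y) / len(y)
predicted = [b * xi + a for xi in x]

# Regression SS: squared differences of predicted Y from the mean of Y.
ss_reg = sum((yp - mean_y) ** 2 for yp in predicted)
# Residual SS: squared differences of predicted Y from actual Y.
ss_error = sum((yi - yp) ** 2 for yi, yp in zip(y, predicted))
# Total SS: squared differences of actual Y from the mean of Y.
ss_total = sum((yi - mean_y) ** 2 for yi in y)

print(round(ss_reg, 2), round(ss_error, 2), round(ss_total, 2))
```

For these scores the total of 6 splits into about 3.6 for regression and 2.4 for error.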
What is accuracy in simple linear regression
R squared
The proportion of variance in Y (DV) that can be explained by X (IV)
Same process as Pearson r/correlation: obtain r, then square it
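A minimal sketch of obtaining r and squaring it, on made-up scores. The computational formula r = SP / sqrt(SSx * SSy) is the usual deviation-score form of Pearson r and is assumed here, since the card does not spell it out.

```python
import math

# Hypothetical scores, invented for this example only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))  # sum of products
ss_x = sum((xi - mean_x) ** 2 for xi in x)                        # sum of squares of X
ss_y = sum((yi - mean_y) ** 2 for yi in y)                        # sum of squares of Y

r = sp / math.sqrt(ss_x * ss_y)  # Pearson r
r_squared = r ** 2               # proportion of variance in Y explained by X

print(round(r, 3), round(r_squared, 2))
```

With these scores r squared comes out at about 0.6, i.e. roughly 60% of the variance in Y is explained by X.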
What are steps in solving correlation and Simple linear regression
- State the hypothesis
Ho - regression line is flat = implies that there is no change in Y for every change in X; Preceded by a non-significant correlation relationship
Ha - regression line is not flat = implies that there is a change in Y for every change in X
- Set level of significance - either .05 or .01
- Compute the test statistic
- Make the decision - Robt is larger than Rcrit = Reject Ho
Robt is smaller than Rcrit = fail to reject Ho - nonsignificant relationship
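A rough sketch of the decision step on made-up scores. The helper names pearson_r and decide are hypothetical. The critical value .878 is assumed to be the two-tailed table value for alpha = .05 with df = n - 2 = 3, and comparing the absolute value of Robt is an added assumption so that negative correlations are handled too.

```python
import math

def pearson_r(x, y):
    """Pearson r from deviation scores: SP / sqrt(SSx * SSy)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return sp / math.sqrt(ss_x * ss_y)

def decide(r_obt, r_crit):
    """Reject Ho when the obtained r is larger in size than the critical r."""
    return "reject Ho" if abs(r_obt) > r_crit else "fail to reject Ho"

# Hypothetical scores, invented for this example only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_obt = pearson_r(x, y)
r_crit = 0.878  # assumed table value: two-tailed, alpha = .05, df = 3

print(round(r_obt, 3), decide(r_obt, r_crit))
```

With only five pairs of scores the obtained r falls short of the critical value, so the decision in this example is to fail to reject Ho.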
How do we interpret slope
For every 1-unit change in X, there is a corresponding change of b units in Y
ex: after solving for predicted Y (Y’)
When comparing children with multiple siblings, it is likely that for every one-sibling increase, the happiness rating goes up by .98 points
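A small sketch of the slope interpretation using the .98 from the example. The intercept of 5.0 and the function name predicted_happiness are hypothetical, since the card gives only the slope. The change per sibling does not depend on the intercept.

```python
b = 0.98  # slope from the example on the card
a = 5.0   # hypothetical Y-intercept, not given on the card

def predicted_happiness(siblings):
    """Predicted happiness rating (Y') for a given number of siblings (X)."""
    return b * siblings + a

# Going from 2 to 3 siblings changes the prediction by exactly the slope.
change = predicted_happiness(3) - predicted_happiness(2)

print(round(change, 2))
```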
How do we interpret regression
Similar to correlation: interpret the strength of the relationship from the size of r
.00 - .29 = weak
.30 - .69 = moderate
.70 - 1.00 = strong
mention r squared - ___% of the variance in Y can be explained by X
ex: when r squared is 0.3481
34.81% of the variance in happiness rating (Y) can be explained by the number of siblings (X)
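A minimal sketch that turns the example into a worded interpretation. The function name strength_label is hypothetical, and the cut-offs .30 and .70 are assumed from the bands on the card. Note that the sign of r cannot be recovered from r squared alone, so the square root only gives its size.

```python
import math

def strength_label(r):
    """Label the size of r using the bands on the card (weak, moderate, strong)."""
    size = abs(r)
    if size < 0.30:
        return "weak"
    if size < 0.70:
        return "moderate"
    return "strong"

r_squared = 0.3481                   # value from the example on the card
r = math.sqrt(r_squared)             # size of r only, sign unknown
percent = round(r_squared * 100, 2)  # proportion of variance as a percentage

print(f"r = {r:.2f} ({strength_label(r)}), "
      f"{percent}% of the variance in Y can be explained by X")
```

An r squared of 0.3481 corresponds to an r of about .59, which falls in the moderate band.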