Simple Linear Regression Flashcards
What is the goal of regression?
To predict Y (outcome variable) from X (predictor).
Which variable is fixed in a regression equation?
X is a fixed variable and Y is always the random variable.
T or F: There is no sampling error involved in Y.
Why or why not?
F. Because Y is a random variable, it is subject to sampling error. X, by contrast, is a fixed variable, so there is no sampling error involved in X.
Concerning the population parameters that go along with the sample statistics in simple linear regression, what do we predict Y from?
We predict the outcome Y from beta-naught (the intercept) plus beta-1 (the slope) multiplied by the predictor, plus epsilon, the population error term: Y = β0 + β1X + ε. In a sample, the error term is the residual (e).
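A minimal R sketch of this population model (the parameter values and error SD below are made-up assumptions for illustration):
set.seed(42)                            # make the simulated example reproducible
n       <- 100
beta0   <- 2                            # assumed population intercept
beta1   <- 0.5                          # assumed population slope
x       <- seq_len(n)                   # X is fixed, not random
epsilon <- rnorm(n, mean = 0, sd = 1)   # epsilon: the population error term
y       <- beta0 + beta1 * x + epsilon  # Y = beta0 + beta1*X + epsilon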
What is our purpose in modeling error?
Our purpose is to find the line that best summarizes the relationship between X and Y.
What is error called for the population and for the sample?
Epsilon (ε) for the population; e, or the residual, for the sample.
What is model error?
It is the amount by which the observed data points (the people) deviate from the model line.
Define sampling error.
The difference between a population parameter and a sample statistic.
What is the goal in simple linear regression, with regard to the line?
Our goal is to find the line of best fit.
Out of all possible lines, we are trying to find the one that results in the least difference between the observed data and the line.
How do we use the regression line to predict values?
We fit a statistical model to the data in the form of a straight line. This line is the line that BEST FITS the pattern of data.
What does y-hat indicate?
Y-hat denotes the line itself; the notation marks that it is different from observed Y, which includes error.
Which contains error: The line or the model?
The model contains error.
Why is Y-hat considered a predictive Y?
What does this have to do with residuals?
Y-hat is a predictive Y because it signifies the Y-values that are predicted from the line.
The difference between what’s predicted from the line and the observed value (Y) is the residual.
What is y-hat’s equation?
Y-hat = b0 + b1x
b0 = intercept; b1 = slope; x = predictor value
How do we check the correlation for a simple linear regression in R? What does it produce?
rcorr(as.matrix(dataset))   # rcorr() comes from the Hmisc package
Produces the correlation coefficients (r), the sample sizes (n), and the p-values.
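For example, with R's built-in mtcars data (the choice of variables here is just an illustration, not from the cards):
library(Hmisc)                                # rcorr() lives in the Hmisc package
rcorr(as.matrix(mtcars[, c("mpg", "wt")]))    # prints r, n, and the p-values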
What information do we need to create a regression equation?
We need to fill in the intercept (b0) and slope (b1), so we need to determine the line of best fit.
How is regression conceptually similar to ANOVA?
With an ANOVA, we compared MSbetween and MSwithin; we want to minimize MSwithin (error), and we want to do the same in regression by making error as small as possible.
Before, we wanted to see how points deviated from the mean; now we want to see how each point deviates from the regression line.
Why do we create a sum of squares for a simple linear regression equation?
We want to minimize the sum of the squared residuals (OLS solution).
Each point's distance from the line gives a residual, and we add them up. The problem is that the distances of points above the line cancel the distances of points below the line, so the raw residuals sum to zero. So we must square them first.
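A quick R demonstration of the cancellation (the x and y values are made up; with an OLS line, the raw residuals sum to essentially zero):
x   <- c(1, 2, 3, 4, 5)          # made-up predictor values
y   <- c(2, 4, 5, 4, 6)          # made-up outcome values
fit <- lm(y ~ x)                 # OLS line of best fit
res <- y - fitted(fit)           # residuals: observed minus predicted
sum(res)                         # ~0: distances above and below the line cancel
sum(res^2)                       # squaring first removes the cancellation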
What are the similarities and differences between a correlation and the simple linear regression?
If there’s just 1 predictor, we see a lot of similarities. The only thing that changes is how we treat the variables (prediction vs. description).
After we run the rcorr function in R and see a significant result for a predictor, what do we do next?
Since the variables are significantly correlated, we know the direction of the relationship and can move on to the regression to predict Y.
Why do we compute the ordinary least squares (OLS) solution? What is the criterion to be minimized in OLS?
We compute the OLS solution because it’s an estimation procedure done for regression where we minimize the sum of the squared residuals.
As long as we can put numbers to b0 and b1, what information can we find?
Provide an example if b0 = 1 and b1 = 4.
If b0 = 1 and b1 = 4, I can fill those into my regression equation: y-hat = 1 + 4x. Filling in all of my x values gives my y-hat values; the difference between each y and y-hat gives the residuals; and once I have the residuals, I can compute the sum of squared residuals.
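That chain of steps as a short R sketch (the x and y values are made up for illustration):
b0    <- 1
b1    <- 4
x     <- c(0, 1, 2, 3)           # made-up predictor values
y     <- c(2, 4, 10, 12)         # made-up observed outcomes
y_hat <- b0 + b1 * x             # y-hat = 1 + 4x
e     <- y - y_hat               # residuals: y minus y-hat
sum(e^2)                         # sum of squared residuals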
Explain what derivative is and why we use the function of a derivative.
The derivative gives the slope of a line tangent to the curve at a particular point. We can never get the slope of a curve directly, but we can find the straight line that touches the curve at exactly one point, and the slope of that tangent line tells us how the function is changing at that location.
After a variety of guesses, the computer ultimately ends up at the minimum of the sum of squared residuals function. The way we know we are at the minimum is that the line tangent to the curve there has a slope of zero.
This happens when we take the derivative of the function and set it to zero; by the magic of calculus, we end up with equations for b0 and b1.
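A sketch of that "variety of guesses" idea in R: optim() searches numerically for the b0 and b1 that minimize the sum of squared residuals, and the result can be compared with the closed-form answer from lm() (the data are made up):
x   <- c(1, 2, 3, 4, 5)                            # made-up predictor values
y   <- c(2, 4, 5, 4, 6)                            # made-up outcome values
ssr <- function(b) sum((y - (b[1] + b[2] * x))^2)  # sum of squared residuals
optim(c(0, 0), ssr)$par          # numeric search: b0 and b1 at the minimum
coef(lm(y ~ x))                  # calculus answer: derivative set to zero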
Why is SSy (sum of squares y) missing in the bivariate information in SL regression?
Compare this to correlation equation.
In the correlation equation, we divided SSCPxy by the square root of SSx times SSy because we were interested in how the variables relate to each other after removing what is unique to each of them.
When I'm interested in predicting Y, I don't care how Y varies with itself, only how Y varies along with X. We aren't interested in the univariate information of Y, just in how the two variables relate after removing what is unique to X, so I'm left with everything in Y that is shared with X. The comparison after this card shows the two formulas side by side.
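Side by side, in the notation of these cards:
r  = SSCPxy / sqrt(SSx * SSy)    (correlation: remove what is unique to X and to Y)
b1 = SSCPxy / SSx                (slope: remove only what is unique to X)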
Conceptually, what are b0 and b1?
What is the equation for both?
b1 (slope) tells me for every 1 unit increase in X, how much Y changes. It tells me the change in Y based on changes in X.
b1 = SSCPxy / SSx
b0 (intercept) is the mean of Y minus b1 times the mean of X.
To obtain b0, we compute the slope first; then we multiply it by the mean of X and subtract that product from the mean of Y.
b0 = Y-bar - b1(X-bar)
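Both formulas computed directly in R and checked against lm() (made-up data again):
x      <- c(1, 2, 3, 4, 5)                       # made-up predictor values
y      <- c(2, 4, 5, 4, 6)                       # made-up outcome values
SSCPxy <- sum((x - mean(x)) * (y - mean(y)))     # sum of cross-products
SSx    <- sum((x - mean(x))^2)                   # sum of squares of X
b1     <- SSCPxy / SSx                           # slope
b0     <- mean(y) - b1 * mean(x)                 # Y-bar minus b1 times X-bar
c(b0, b1)                                        # hand-computed intercept and slope
coef(lm(y ~ x))                                  # should match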