Lesson 3 - Estimation of Tree Volume Flashcards
(10 cards)
What is linear regression?
linear regression is used to relate one dependent variable (y) to one or more independent variables (x’s)
What is simple linear regression and what is it used to find?
linear regression with only one independent variable, used to find estimates of the slope and intercept from sample data
once obtained we can:
- determine how well the regression line fits the sample data (goodness of fit)
- calculate confidence intervals for the true slope and intercept (population)
- calculate confidence intervals for the mean predicted y value (y value on the regression line)
- test whether the regression line is significant
What assumptions must be met for simple linear regression?
- the relationship between x and y is linear
- the variance of the y values must be the same for every x value
- each observation (x,y) must be independent of all other observations
- the y value must be normally distributed for each x value
- the x value must be measured without error
- the y values are selected randomly for each x value
Explain SSX and SPXY
SSX = sum of squares of x
= sum of (x squared) - (sum of x) squared / n
SPXY = sum of products of x and y
= sum of (x*y) - (sum of x)(sum of y) / n
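The two formulas on this card can be sketched in a few lines of Python. This is only an illustration: the function names `ssx` and `spxy` and the sample values are made up for the example and do not come from the lesson.

```python
def ssx(x):
    """Sum of squares of x: sum(x^2) - (sum x)^2 / n."""
    n = len(x)
    return sum(xi ** 2 for xi in x) - sum(x) ** 2 / n


def spxy(x, y):
    """Sum of products of x and y: sum(x*y) - (sum x)(sum y) / n."""
    n = len(x)
    return sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n


# Made-up sample, not taken from the lesson.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print(ssx(x))      # 55 - 225/5 = 10.0
print(spxy(x, y))  # 66 - (15 * 20)/5 = 6.0
```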
What are b0 and b1, and how are they related to SSX and SPXY?
b0 and b1 are coefficients of the regression line
b1 is the slope
= SPXY/SSX
b0 is the y intercept
= mean of y - b1 * mean of x
these values minimize the sum of squared differences between the observed y values and the y values on the regression line (least squares)
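As a rough worked example of this card, the slope and intercept can be computed directly from SSX and SPXY. The helper name `fit_line` and the sample data are assumptions made for illustration only.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ssx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    spxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    b1 = spxy / ssx             # slope = SPXY / SSX
    b0 = mean_y - b1 * mean_x   # intercept = mean of y - b1 * mean of x
    return b0, b1


# Made-up sample, not taken from the lesson.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x, y)
print(f"y = {b0:.2f} + {b1:.2f} x")  # y = 2.20 + 0.60 x
```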
What is goodness of fit and how can we describe it?
How well the line actually explains (fits) the data
can be described through the coefficient of determination and the standard error of the estimate
Explain the coefficient of determination
r squared
the proportion of the variation in the y observations accounted for by the regression; the value will always be between 0 and 1
the higher the value, the higher the proportion of variation in y that is accounted for by the regression
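One common way to compute r squared is 1 - SSE/SSY, where SSE is the sum of squared residuals and SSY is the sum of squares of y. These two symbols are not used in the lesson and are introduced here only for the sketch, as are the function name and the sample data.

```python
def r_squared(x, y):
    """Coefficient of determination for the simple linear regression of y on x."""
    n = len(x)
    ssx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    ssy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
    spxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    b1 = spxy / ssx
    b0 = sum(y) / n - b1 * sum(x) / n
    # SSE: variation in y left unexplained by the regression line.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - sse / ssy


# Made-up sample, not taken from the lesson.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r2 = r_squared(x, y)
print(f"r squared = {r2:.2f}")  # r squared = 0.60
```

In this made-up sample the line accounts for about 60% of the variation in y.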
What is the standard error of the estimate?
gives us an indication of how far the observations are spread around the regression line
approximately 68% of the sample observations will be within one standard error of the regression line, and approximately 95% will be within two standard errors (given normally distributed y values)
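The lesson does not give the formula here. A minimal sketch, assuming the usual definition sqrt(SSE / (n - 2)), where SSE is the sum of squared residuals, could look like the following. The function name and data are illustrative only.

```python
import math


def standard_error_of_estimate(x, y):
    """Spread of the observations around the fitted line: sqrt(SSE / (n - 2))."""
    n = len(x)
    ssx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    spxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    b1 = spxy / ssx
    b0 = sum(y) / n - b1 * sum(x) / n
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    # n - 2 degrees of freedom: two coefficients (b0, b1) were estimated.
    return math.sqrt(sse / (n - 2))


# Made-up sample, not taken from the lesson.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

s_e = standard_error_of_estimate(x, y)
print(f"standard error of the estimate = {s_e:.3f}")
```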
What are confidence intervals?
a range around a sample estimate that is expected, at a chosen confidence level, to contain the true population value
- indicates how likely it is that, if we took another sample set, we would obtain similar estimates of the slope and intercept
What is the standard deviation?
how much the estimates of the slope and intercept would be expected to vary across different sample sets
- can also be calculated for the mean predicted value of y for a given value of x
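To show how such a standard deviation feeds into a confidence interval, here is a hedged sketch using the textbook formulas s_b1 = s_e / sqrt(SSX) for the slope and s_e * sqrt(1/n + (x0 - mean of x)^2 / SSX) for the mean predicted y at x0. None of these formulas or names appear in the lesson. The t value is passed in by hand; 3.182 is the two-sided 95% value for 3 degrees of freedom from a t table.

```python
import math


def regression_summary(x, y):
    """Return n, mean of x, SSX, b0, b1 and the standard error of the estimate."""
    n = len(x)
    mean_x = sum(x) / n
    ssx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    spxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    b1 = spxy / ssx
    b0 = sum(y) / n - b1 * mean_x
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))
    return n, mean_x, ssx, b0, b1, s_e


def slope_interval(x, y, t_crit):
    """Confidence interval for the true slope: b1 +/- t * s_b1."""
    n, mean_x, ssx, b0, b1, s_e = regression_summary(x, y)
    s_b1 = s_e / math.sqrt(ssx)  # standard deviation of the slope estimate
    return b1 - t_crit * s_b1, b1 + t_crit * s_b1


def mean_prediction_interval(x, y, x0, t_crit):
    """Confidence interval for the mean predicted y at x = x0."""
    n, mean_x, ssx, b0, b1, s_e = regression_summary(x, y)
    y_hat = b0 + b1 * x0
    s_yhat = s_e * math.sqrt(1 / n + (x0 - mean_x) ** 2 / ssx)
    return y_hat - t_crit * s_yhat, y_hat + t_crit * s_yhat


# Made-up sample, not taken from the lesson.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
T_CRIT = 3.182  # assumed: two-sided 95% t value for n - 2 = 3 degrees of freedom

slope_lo, slope_hi = slope_interval(x, y, T_CRIT)
mean_lo, mean_hi = mean_prediction_interval(x, y, 3, T_CRIT)
print(f"slope: {slope_lo:.3f} to {slope_hi:.3f}")
print(f"mean y at x = 3: {mean_lo:.3f} to {mean_hi:.3f}")
```

With only five observations the slope interval is wide and includes zero, which is how a small sample shows up as uncertainty in the estimates.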