Quantitative Methods - Correlation And Bivariate Linear Regression Flashcards
(41 cards)
What is one method of showing the relationship that isn’t hugely precise?
Place a ruler on the graph and draw a line by eye. This works reasonably well if all the plotted points are roughly in a straight line.
What is Regression?
The equation of the line of best fit, found mathematically, giving a much more accurate result than the line drawn by eye.
What is Correlation?
An indication of the accuracy or strength of the relationship, i.e. whether this line is a good or poor explanation of the relationship between the two variables
What is the equation of a straight line on a graph?
y = a + bx
a = the height at which the line cuts the y-axis (the intercept); b = the slope, i.e. the change in the value of y per unit change in the value of x
However there are other notations for writing this.
What are the regression coefficients or parameters?
a and b, or their equivalents
Typically, when dealing with a population what notation for a straight line do we use?
y = α + βx (the Greek letters alpha and beta take the place of a and b)
Typically, when dealing with a sample, how will the notation for a straight line appear?
y = a + bx
What is bivariate or simple linear regression?
Regression where a relationship exists between two variables, one of which (the independent variable) will drive/determine the value of the other (the dependent variable)
In terms of regression, what does the independent variable allow?
If we can identify a value for the independent variable, we can use this to predict the value for the dependent variable
With regard to regression, how will we predict the value of the dependent variable?
Interpolation or extrapolation from given expectations regarding the independent variable
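A minimal sketch of the prediction step, assuming a line has already been fitted. The coefficients, the observed range of x, and the function name `predict` are all hypothetical and chosen only for illustration.

```python
# Hypothetical fitted coefficients and observed range of the independent variable.
A = 2.2                    # intercept (assumed)
B = 0.6                    # slope (assumed)
X_MIN, X_MAX = 1.0, 5.0    # smallest and largest x seen in the data (assumed)

def predict(x):
    """Return the predicted y and whether x lies inside the observed range."""
    y_hat = A + B * x
    kind = "interpolation" if X_MIN <= x <= X_MAX else "extrapolation"
    return y_hat, kind

for x in (3.0, 8.0):
    y_hat, kind = predict(x)
    print(f"x = {x}: predicted y = {y_hat:.2f} ({kind})")
```

Extrapolated predictions generally deserve more caution, because the relationship has not been observed outside the range of the data.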
What is the output of regression analysis?
The two coefficients a and b, which determine the relationship between the two variables
What does the coefficient b mean in regression analysis?
The gradient of the line. If this is 2, an increase of 5 on the x-axis will see the value on the y-axis increase by 10.
If b is positive the line will be upward sloping, suggesting y increases as x increases (a positive relationship).
If b is negative the line will be downward sloping, suggesting y decreases as x increases (a negative, or inverse, relationship).
What does the coefficient a mean in regression analysis?
This gives the value of y when x is zero, i.e. where the line cuts the y-axis
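The meaning of the two coefficients can be checked numerically. This is a small sketch in which the values a = 1 and b = 2 and the function name `line` are example assumptions only.

```python
def line(x, a=1.0, b=2.0):
    """Straight line y = a + b*x; a and b are example values only."""
    return a + b * x

# Intercept: the value of y when x is zero.
intercept = line(0)

# Slope of 2: raising x by 5 raises y by 2 * 5 = 10, whatever the starting x.
change_in_y = line(8) - line(3)

# With a negative slope the same rise in x lowers y instead.
change_with_negative_b = line(8, b=-2.0) - line(3, b=-2.0)

print(intercept, change_in_y, change_with_negative_b)
```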
The method for estimating the parameters a and b is referred to as what?
The Least Squares Method.
What is the least squares method?
The method of estimating the parameters a and b by choosing the line that minimises the sum of the squared errors
What is the idea behind the Least Squares Method?
In calculating the regression line we want the line of best fit. This implies that all the actual values on the graph are reasonably close to the line. It is usually not possible to make all the values very close at the same time: since we are fitting a straight line, some values will be further from the line than others.
What is minimising the sum of the squared errors?
Each vertical distance of a point from the line is squared, and the squares are then added. Squaring ensures we consider only the size of each distance and not its direction, since every squared value is positive. It also ensures large distances from the line are avoided, since the squaring process exaggerates the larger distances.
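The idea can be sketched in a few lines using the standard closed-form least-squares formulas, b = S_xy / S_xx and a = mean(y) - b * mean(x). The data points and function names here are made up for illustration and are not taken from the cards.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line y = a + b*x that minimises the sum of squared errors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b

def sum_squared_errors(xs, ys, a, b):
    """Square each vertical distance from the line, then add the squares."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares(xs, ys)
best = sum_squared_errors(xs, ys, a, b)

# Any other line gives a larger total, e.g. after nudging the slope or intercept.
steeper = sum_squared_errors(xs, ys, a, b + 0.1)
higher = sum_squared_errors(xs, ys, a + 0.1, b)
print(a, b, best, steeper, higher)
```

For these points the fitted line is approximately y = 2.2 + 0.6x, and both nudged lines produce a larger sum of squared errors than the fitted one.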
How will the regression equation be often expressed and why is it expressed so?
y = a + bx + e
The e is an error or disturbance term. If the regression line represents the best linear unbiased estimate of the relationship between x and y, the error term will be a random variable with a mean of zero. It reflects that we cannot guarantee the value of y will be as predicted. On a random basis, the actual value will lie both above and below the predicted value.
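A short sketch of the error term on made-up data. The five observations are hypothetical, and a = 2.2 and b = 0.6 are the least-squares coefficients for exactly those points, so the errors should average out to zero.

```python
# Made-up observations; a = 2.2 and b = 0.6 are the least-squares
# coefficients for exactly these five points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6

# e = y - (a + b*x): the part of each observation the line does not explain.
errors = [y - (a + b * x) for x, y in zip(xs, ys)]

mean_error = sum(errors) / len(errors)
above = sum(1 for e in errors if e > 0)   # points above the line
below = sum(1 for e in errors if e < 0)   # points below the line
print(errors, mean_error, above, below)
```

Some errors are positive and some negative, and their mean is zero up to floating-point rounding.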
What is the purpose of the correlation measure?
To measure the strength of the relationship between two variables. In the context of regression analysis with only one independent variable, the correlation coefficient will give an indication of how accurately the regression line matches the observed values
What is the correlation coefficient?
A relative measure indicating how two variables move with respect to each other. It is also called Pearson's correlation coefficient. When a correlation coefficient is calculated by reference to a sample of data, it is referred to by the symbol r. When calculated by reference to a population, it is referred to as ρ (pronounced rho).
It measures the direction and degree of linear association between the two variables. It will have a value between +1 and -1, and the meaning of the correlation coefficient can be best understood by considering the extremes.
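As a rough illustration, the sample coefficient r can be computed by hand as S_xy / sqrt(S_xx * S_yy). The data and the function name `pearson_r` are assumptions made for this sketch.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired observations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
print(round(r, 3))
```

Here r comes out at roughly 0.775: positive, but short of +1 because the points do not all lie on one line.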
What is perfectly positive correlation?
A correlation coefficient of +1. The two variables will move up and down together and in proportion.
What is non-perfect positive correlation?
A correlation coefficient between 0 and 1. The line will still slope upward, but the points won’t all lie on it.
What is perfect negative correlation?
A correlation coefficient of -1
The two variables will move up and down in exact opposition and in proportion.
What is uncorrelated?
A correlation coefficient of 0
The two variables show no linear relationship: a move in one says nothing about the direction of a move in the other.
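The extreme cases can be reproduced with a few hand-picked series. This is only a sketch: the helper `r` is a hand-rolled Pearson calculation and the numbers are invented for the purpose.

```python
from math import sqrt

def r(xs, ys):
    """Pearson correlation of two equal-length lists (hand-rolled helper)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    dx = [x - mx for x in xs]
    dy = [y - my for y in ys]
    top = sum(p * q for p, q in zip(dx, dy))
    return top / sqrt(sum(p * p for p in dx) * sum(q * q for q in dy))

xs = [-2, -1, 0, 1, 2]

perfect_positive = r(xs, [3 + 2 * x for x in xs])   # points on a rising line
perfect_negative = r(xs, [3 - 2 * x for x in xs])   # points on a falling line
no_linear_link = r(xs, [x * x for x in xs])         # symmetric curve
print(perfect_positive, perfect_negative, no_linear_link)
```

The third series gives a coefficient of 0 even though y is fully determined by x. A coefficient of 0 rules out a linear association, not every kind of relationship.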