# Quantitative Methods - Correlation And Bivariate Linear Regression Flashcards

1
Q

What is one method of showing the relationship that isn’t hugely precise?

A

Place a ruler over the plotted points and draw a line by eye. This works reasonably well if all the plotted points lie roughly in a straight line.

2
Q

What is Regression?

A

Finding the equation of the line of best fit mathematically, which gives a much more accurate result than a line drawn by eye.

3
Q

What is Correlation?

A

An indication of the accuracy or strength of the relationship, i.e. whether this line is a good or poor explanation of the relationship between the two variables.

4
Q

What is the equation of a straight line on a graph?

A

y = a + bx

a = the height at which the line cuts the y axis (the intercept)

b = the slope, i.e. the change in the value of y per unit change in the value of x

However, there are other notations for writing this.

5
Q

What are the regression coefficients or parameters?

A

a and b, or their equivalents in other notations

6
Q

Typically, when dealing with a population what notation for a straight line do we use?

A

y = α + βx (the Greek letters alpha and beta are used in place of a and b)

7
Q

Typically when dealing with a sample how will the notation for a straight line appear?

A

y = a + bx

8
Q

What is bivariate or simple linear regression?

A

Where a relationship exists between two variables, one of which will drive/determine the value of the other

9
Q

In terms of regression, what does the independent variable allow?

A

If we can identify a value for the independent variable, we can use this to predict the value for the dependent variable

10
Q

With regard to regression, how will we predict the value of the dependent variable?

A

Interpolation or extrapolation from given expectations regarding the independent variable

11
Q

What is the output of regression analysis?

A

The two coefficients a and b, which determine the relationship between the two variables

12
Q

What does the coefficient b mean in regression analysis?

A

The gradient of the line. If b is 2, an increase of 5 in x will see y increase by 10.
If b is positive, the line will be upward sloping, suggesting y increases as x increases.
If b is negative, the line will be downward sloping, suggesting y decreases as x increases.

13
Q

What does the coefficient A mean in regression analysis?

A

This gives the value of y when x is zero, i.e. the point at which the line cuts the y axis.

14
Q

The method for estimating the parameters A and B is referred to as what?

A

The Least Squares Method.

15
Q

What is the least squares method?

A

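The method of estimating the parameters a and b by choosing the line that minimises the sum of the squared vertical distances between the observed points and the line.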

16
Q

What is the idea behind the Least Squares Method?

A

In calculating the regression line we want the line of best fit. This implies that all the actual values on the graph are reasonably close to the line. It is usually not possible to make all the values very close at the same time: because we are fitting a straight line, some values will be further from the line than others.

17
Q

What is minimising the sum of the squared errors?

A

The vertical distances of the points away from the line are squared and then added together. Squaring ensures we only consider the size of each distance and not the direction, as all the squared values are positive. It also ensures large distances from the line are avoided, since the squaring process exaggerates the larger distances.
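
As an illustration of the idea, here is a minimal Python sketch of a least squares fit for y = a + bx. The function names and the data are hypothetical, chosen only for this example; the formulas used are the standard ones for one independent variable (slope = sum of cross-deviations divided by sum of squared x-deviations, intercept = mean of y minus slope times mean of x).

```python
def least_squares_fit(xs, ys):
    """Return (a, b) for the line y = a + b*x that minimises the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept: the line passes through (mean_x, mean_y)
    return a, b


def sum_squared_errors(xs, ys, a, b):
    """Square each vertical distance from the line, then add them up."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_fit(xs, ys)
best_sse = sum_squared_errors(xs, ys, a, b)

# Any other line, e.g. one with a slightly steeper slope, gives a larger sum.
other_sse = sum_squared_errors(xs, ys, a, b + 0.1)

print(f"a = {a:.2f}, b = {b:.2f}, SSE = {best_sse:.2f}, SSE of other line = {other_sse:.2f}")
```

Comparing `best_sse` with `other_sse` shows the sense in which the fitted line is the "best": nudging the slope away from the fitted value increases the sum of squared errors.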

18
Q

How is the regression equation often expressed, and why is it expressed so?

A

Y = a + bx + e

The e is an error or disturbance term. If the regression line represents the best linear unbiased estimate of the relationship between x and y, the error term will be a random variable with a mean of zero. It reflects that we cannot guarantee the value of y will be as predicted. On a random basis, the actual value will lie both above and below the predicted value.
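
A small sketch may help here, using hypothetical data and a hypothetical helper `fit_line`. It computes the residuals (the observed y minus the predicted y, which are the sample counterpart of the error term e) for a least squares line and shows that they average to zero, with some points above the line and some below.

```python
def fit_line(xs, ys):
    """Least squares estimates (a, b) for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)

# Residual = actual y minus the y predicted by the line.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
mean_residual = sum(residuals) / len(residuals)

above = sum(1 for e in residuals if e > 0)  # points above the line
below = sum(1 for e in residuals if e < 0)  # points below the line

print(f"mean residual = {mean_residual:.10f}, above = {above}, below = {below}")
```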

19
Q

What is the purpose of the correlation measure?

A

To measure the strength of the relationship between two variables. In the context of regression analysis with only one independent variable, the correlation coefficient will give an indication of how accurately the regression line matches the observed values

20
Q

What is the correlation coefficient?

A

A relative measure indicating how two variables move with respect to each other. It is also called Pearson's correlation coefficient. When a correlation coefficient is calculated by reference to a sample of data, it is referred to by the symbol r. When calculated by reference to a population, it is referred to as ρ (the Greek letter rho).

It measures the direction and degree of linear association between the two variables. It will have a value between -1 and +1, and the meaning of the correlation coefficient can be best understood by considering the extremes.
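
The sample coefficient r can be sketched in a few lines of Python. This assumes the standard Pearson formula (sum of cross-deviations divided by the square root of the product of the two sums of squared deviations); the function name and data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]

r_perfect_pos = pearson_r(xs, [3, 5, 7, 9, 11])   # y = 1 + 2x exactly
r_perfect_neg = pearson_r(xs, [9, 7, 5, 3, 1])    # y = 11 - 2x exactly
r_scattered = pearson_r(xs, [2, 4, 5, 4, 5])      # upward trend, points off the line

print(round(r_perfect_pos, 4), round(r_perfect_neg, 4), round(r_scattered, 4))
```

The first two cases are the extremes of +1 and -1; the third falls between 0 and 1 because the points trend upward without all lying on one line.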

21
Q

What is perfectly positive correlation?

A

A correlation coefficient of +1. Two variables will move up and down together and in proportion.

22
Q

What is non-perfect positive correlation?

A

A correlation coefficient between 0 and 1. The line will still slope upward, but the points won’t all lie on it.

23
Q

What is perfect negative correlation?

A

A correlation coefficient of -1

The two variables will move up and down in exact opposition and in proportion.

24
Q

What is uncorrelated?

A

A correlation coefficient of 0

There is no linear relationship between the two variables: movements in one give no linear indication of movements in the other.

25
Q

Generally what is viewed as a moderate correlation?

A

Around 0.5 (ignoring the sign)

26
Q

What is viewed as weak correlation?

A

Less than 0.5 (ignoring the sign)

27
Q

What is viewed as strong correlation?

A

Above 0.5 (ignoring the sign)

28
Q

What are the two relevant factors in a correlation coefficient?

A

The sign (+ or -) and the value

29
Q

What are the effects of the two relevant factors in the correlation coefficient?

A

Positive correlation implies that variables move up and down together, negative correlation means that they move in opposition.

The value of the figure, ignoring the sign, gives an indication of the strength of the relationship: the closer to one, the stronger the relationship.

30
Q

What is the link between the correlation coefficient and the b coefficient in the regression equation?

A

If the correlation coefficient is positive/negative, then coefficient b will also be positive/negative. If there is no correlation, b will be zero.
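
The link can be checked numerically. The sketch below uses hypothetical data and a hypothetical helper `slope_and_correlation`, and relies on the standard identity b = r × (spread of y / spread of x). Because the ratio of spreads is always positive, b and r must share the same sign, and b is zero exactly when r is zero.

```python
import math


def slope_and_correlation(xs, ys):
    """Return (b, r, spread_ratio) for the data, where spread_ratio = sqrt(syy / sxx)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx                      # regression slope
    r = sxy / math.sqrt(sxx * syy)     # correlation coefficient
    return b, r, math.sqrt(syy / sxx)


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
b_up, r_up, ratio_up = slope_and_correlation(xs, [2, 4, 5, 4, 5])        # upward trend
b_down, r_down, ratio_down = slope_and_correlation(xs, [5, 4, 5, 4, 2])  # downward trend
b_flat, r_flat, ratio_flat = slope_and_correlation(xs, [2, 1, 0, 1, 2])  # no linear trend

for label, b, r in [("up", b_up, r_up), ("down", b_down, r_down), ("flat", b_flat, r_flat)]:
    print(f"{label}: b = {b:.3f}, r = {r:.3f}")
```

Note that the "flat" data set has a clear U-shaped pattern, yet both b and r are zero: the two measures only pick up linear association.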

31
Q

What is significant about the B coefficient compared to the correlation coefficient?

A

The b coefficient gives more information than the correlation coefficient, since it not only shows the direction of the relationship but also shows by how much the dependent variable will change. However, use of the b coefficient involves the limiting assumption that the dependent variable is a function of the independent variable. It is not necessary to designate a dependent and independent variable when calculating the correlation coefficient.

32
Q

What is a spurious relationship?

A

When two things appear related but actually aren’t, such as wasp stings and ice cream production.

33
Q

A

A regression relationship and high correlation in themselves do not prove causality.

34
Q

What is data mining?

A

The process of economists sorting through data looking for information that may be relevant, but which may actually reflect a spurious relationship.

35
Q

What can the regression line be used for?

A

To estimate a value for y given a value for x

36
Q

What is interpolation?

A

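Where we try to estimate a value of y for a value of x that lies within the already observed range, e.g. estimating the profits that will arise from a level of sales within the observed range of sales.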

37
Q

What is extrapolation?

A

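Where we try to predict a value of y for a value of x that lies beyond (either above or below) the observed range, e.g. predicting the profits that will arise from a level of sales outside the observed range of sales.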

38
Q

What is the validity of interpolation and extrapolation?

A

Estimating profit levels arising from sales within the observed range should provide reasonable results, since there is data to support our conclusions.

Estimating profits arising outside the range could be dangerous since we have no data to support the results we will be estimating. It is quite possible that a relationship that appears linear over a short range of values turns out to be non-linear when we move beyond those values, making any extrapolated results worthless or even dangerous.

However, the data can be used to establish an expected value or forecast, and the variability of the data can be used to establish confidence limits within which we can say, with a given degree of confidence, that the observed value will lie.
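
To make the distinction concrete, here is a hedged sketch in Python. The sales and profit figures, and the helper names `fit_line` and `predict`, are invented for illustration. The prediction is labelled as interpolation or extrapolation depending on whether the sales figure falls inside the observed range.

```python
def fit_line(xs, ys):
    """Least squares estimates (a, b) for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def predict(x_new, xs, a, b):
    """Return (predicted y, kind), where kind says whether x_new is inside the observed range."""
    kind = "interpolation" if min(xs) <= x_new <= max(xs) else "extrapolation"
    return a + b * x_new, kind


# Hypothetical observations: sales (x) and profits (y).
sales = [10, 20, 30, 40, 50]
profits = [3, 5, 8, 9, 12]

a, b = fit_line(sales, profits)

inside = predict(35, sales, a, b)    # 35 lies within the observed 10 to 50 range
outside = predict(80, sales, a, b)   # 80 lies beyond anything observed

print(f"sales 35: profit {inside[0]:.1f} ({inside[1]})")
print(f"sales 80: profit {outside[0]:.1f} ({outside[1]})")
```

The arithmetic is identical in both cases; only the first prediction is backed by nearby observations, which is why the second should be treated with caution.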

39
Q

What are some issues with regression and correlation analysis?

A

They work for linear relationships; however, many business and economic relationships do not follow linear trends and are much more complicated.

40
Q

What is multiple regression?

A

Where the dependent variable (y) may depend on more than one independent variable.

41
Q

What do regression and correlation analysis do?

A

They enable us to quantify relationships.
This allows us to estimate a specific point on one axis if we know the value of the corresponding point on the other axis.