BUSI344 / CHAPTER 6 QUESTIONS 2 Flashcards
FOUR MEASURES OF GOODNESS OF FIT
They are the coefficient of determination (R2), the standard error of the estimate (SEE), the coefficient of variation (COV), and the F-Statistic. In different ways, each indicates how well the equation succeeds in predicting sales prices and minimizing errors.
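For reference, here is a minimal sketch (not from the course materials; the sales data and variable names are invented purely for illustration) of how all four measures can be computed for a one-variable model using numpy:

```python
import numpy as np

# Hypothetical data (invented for illustration): living area in sq. ft. and sale price
sqft = np.array([900, 1100, 1250, 1400, 1600, 1750, 1900, 2100], dtype=float)
price = np.array([62000, 71000, 74500, 80000, 86000, 90500, 95000, 103000], dtype=float)

n = len(price)  # number of sales
k = 1           # number of independent variables

# Fit the one-variable regression line: price = b0 + b1 * sqft
b1, b0 = np.polyfit(sqft, price, 1)
predicted = b0 + b1 * sqft

sse = np.sum((price - predicted) ** 2)     # sum of squared errors (unexplained variation)
sst = np.sum((price - price.mean()) ** 2)  # total variation around the mean sale price

r2 = 1 - sse / sst                              # coefficient of determination
see = np.sqrt(sse / (n - k - 1))                # standard error of the estimate (dollars)
cov = see / price.mean() * 100                  # coefficient of variation (percent)
f_stat = ((sst - sse) / k) / (sse / (n - k - 1))  # F-statistic: explained vs. unexplained variance

print(f"R2 = {r2:.3f}, SEE = ${see:,.0f}, COV = {cov:.1f}%, F = {f_stat:.1f}")
```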
TWO MEASURES THAT RELATE TO THE IMPORTANCE OF INDIVIDUAL VARIABLES
The correlation coefficient (r) and the t-statistic relate to the importance of individual variables in the model.
COEFFICIENT OF DETERMINATION
R2 measures how much of the variability in the dependent variable (sale price) is accounted for (or explained) by the regression line.
That is, essentially, how good the estimates of selling price are when based on this expression involving total square footage of living area.
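In symbols (a standard textbook form, shown here for reference rather than quoted from the course text):

\[
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
\]

where y_i is an actual sale price, ŷ_i is the price predicted by the regression line, and ȳ is the average sale price; the numerator is the unexplained (error) variation and the denominator is the total variation in sale prices.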
POSSIBLE VALUES OF R2
Possible values of R2 range from 0 to 1. When R2 = 0, none of the variation in sales prices is explained by the model. On the other hand, when R2 = 1, all deviations from the average sale price are explained by the regression equation and the sum of the squared errors equals 0. In a one-variable model, this implies that all sales prices lie on a straight line.
R2 STATISTIC MEASURES . . . . .
The R2 statistic measures the percentage of variation in the dependent variable (sale price) explained by the independent variable (living area).
INTERPRETATION OF R2 - EXAMPLE
If the R2 is 0.59, this means that the regression line is able to explain about 60% of the variation of the sales prices (“variation” refers to the squared differences between sales prices and the average sale price). In practice, this can be loosely interpreted to mean total living area accounts for about 60% of the purchaser’s decision to buy a specific condo. Or, conversely, total living area determines 60% of the selling price set by the vendor, while 40% is explained by other characteristics or by random variations in price. These two statements make intuitive sense at the very least - an important result, as common sense is a key factor in analyzing regression results!
R2 HAS TWO SHORTCOMINGS
The use of R2 has two shortcomings. First, as we add more regression variables, R2 can only increase or stay the same, which can overstate goodness of fit when insignificant variables are included or the number of variables is large relative to the number of sales.
The second shortcoming of R2 (shared also by adjusted R2) is more a matter of care in interpretation. There can be no specified universal critical value of R2; i.e., you cannot say “acceptable results have an R2 of 85%” or any other value. The critical value of the R2 statistic will vary with several factors, and there are several non-mathematical reasons for variations in R2 which make setting a specific target for this statistic inadvisable.
ADJUSTED R2
R2 can be adjusted to account for the number of independent variables, resulting in its sister statistic, adjusted R2. In the present example, the addition of number of windows as a nineteenth variable will cause adjusted R2 to fall unless the variable makes some minimum contribution to the predictive power of the equation.
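One common form of the adjustment (a standard textbook formula, included here for reference) is:

\[
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}
\]

where R̄2 denotes adjusted R2, n is the number of sales, and k is the number of independent variables; an added variable raises R̄2 only if it improves R2 by more than the penalty for using up a degree of freedom.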
WHAT’S MORE IMPORTANT IN REGRESSION MODELS?
In general in regression models, improving the standard error and COV is more important than increasing the adjusted R2, but you should generally try to have the adjusted R2 as high as possible and the standard error and COV as low as possible.
MEASURES THE DIFFERENCE BETWEEN REGRESSION LINE AND ACTUAL OBSERVATIONS
The standard error of the estimate (SEE) is one measure of how good the best fit is, in terms of how large the differences are between the regression line and the actual sample observations.
THE SEE MEASURES . . . .
The SEE measures the amount of deviation between actual and predicted sales prices.
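In symbols (a standard computational form, for reference):

\[
\text{SEE} = \sqrt{\frac{\sum_i (y_i - \hat{y}_i)^2}{n - k - 1}}
\]

where y_i is an actual sale price, ŷ_i is the predicted price, n is the number of sales, and k is the number of independent variables, so the SEE can be read as the typical dollar size of the model's prediction errors.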
DISTRIBUTION OF REGRESSION ERRORS
In our example, we found an SEE of $9,556.24. Note that whereas R2 is a percentage figure, the SEE is a dollar figure if the dependent variable is price. Similar to the standard deviation discussion in Lesson 1, assuming the regression errors are normally distributed, approximately 68% of the errors will be $9,556 or less in absolute value and approximately 95% will be $19,112 or less (see Figure 2.1 in Lesson 2).
NOTE ON R2
In mass appraisal, we often divide properties into sub-groups and develop separate model equations for each, e.g., for each neighbourhood separately.
This reduces the variance among sales prices within each sub-group, and therefore we should not expect MRA to explain as large a percentage of the variation as when one equation is fit to the entire jurisdiction. For example, if one model is developed to estimate sale price for all neighbourhoods in a sales database, there may be $300,000 in variation among the sales prices.
A model that explains 80% of the variation still leaves 20%, or $60,000, unexplained. A model for a single neighbourhood, with only $50,000 of variation in sale price, may have an adjusted R2 of only 60%, but will produce better estimates of sales prices in that neighbourhood because the unexplained 40% of the variation is only $20,000. The standard error and COV (discussed later) will show this improvement.
PROBLEM WITH USING SEE
The problem with the SEE is that it is an absolute measure: its size alone does not tell you much, and thus it can only be used in comparison to other similar models. However, you can create a further statistic from it that tells you how well you are doing in relative terms in your particular model. By dividing the SEE by the mean of the dependent variable, you get a relative measure called the coefficient of variation or COV.
EXPRESSING SEE AS A PERCENTAGE
In our example, the SEE is $9,556. This would indicate a good predictive model when mean property values are high, but not when they are low. Expressing the SEE as a percentage of the mean sale price removes this source of confusion.
COEFFICIENT OF VARIATION IS . . .
In regression analysis, the coefficient of variation (COV) is the SEE divided by the mean sale price and multiplied by 100; that is, the SEE expressed as a percentage of the mean sale price.
INTERPRETING THE COV
The COV is calculated by dividing the SEE (9,556.24) by the mean of the sale prices (76,593.50) and multiplying by 100, yielding 12.48%. In general, for residential models which have sale price as the dependent variable, a COV of approximately 20% is acceptable, while a COV of approximately 10% indicates a very good result. At 12.5%, our model’s COV is acceptably small, but not fantastic. This tells us that total square footage of living area does a fairly good job of predicting sale price, but there is more to sale price than just this one variable (as we would expect!).
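Written out, the arithmetic from the example is:

\[
\text{COV} = \frac{\text{SEE}}{\text{mean sale price}} \times 100 = \frac{9{,}556.24}{76{,}593.50} \times 100 \approx 12.48\%
\]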
THE CORRELATION COEFFICIENT
The correlation coefficient (r) is the first of two statistics that relate to individual regression variables. As explained in Lesson 1, the correlation coefficient is a measure that indicates the strength of the relationship between two variables. It can take on values from -1.0 to +1.0, ranging from perfect negative correlation, through no correlation at 0, to perfect positive correlation.
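For two variables x and y, the usual formula (a standard definition, given here for reference) is:

\[
r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \; \sum_i (y_i - \bar{y})^2}}
\]

In a one-variable regression, the square of r between the independent and dependent variables equals the model's R2.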
SIZE OF SEE
If the SEE is small, the observations are tightly scattered around the regression line. If the SEE is large, the observations are widely scattered around the regression line. The smaller the standard error, the better the fit.
THE CORRELATION COEFFICIENT MEASURES
The correlation coefficient measures how strongly two variables have a straight line relation to each other, but does not give the exact relationship. Two sets of data (x,y) yielding exactly the same regression equation (straight line) may have very different correlation coefficients between x and y.
REGRESSION COEFFICIENTS INDICATE . . .
Regression coefficients indicate how variables are related; that is, how many units (dollars) the dependent variable changes when the independent variable changes by one unit (for example, one square foot), with other variables in the equation held constant.
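A minimal two-variable sketch of how the coefficients are read (the sales data and the second variable, lot size, are invented purely for illustration):

```python
import numpy as np

# Hypothetical sales: living area (sq. ft.), lot size (sq. ft.), and sale price
sqft = np.array([900, 1100, 1250, 1400, 1600, 1750], dtype=float)
lot = np.array([3000, 3500, 3200, 4000, 4200, 4500], dtype=float)
price = np.array([62000, 71000, 74500, 80000, 86000, 90500], dtype=float)

# Design matrix with a constant term, then an ordinary least-squares fit
X = np.column_stack([np.ones_like(sqft), sqft, lot])
coefs, *_ = np.linalg.lstsq(X, price, rcond=None)
constant, b_sqft, b_lot = coefs

# b_sqft is read as: dollars added to predicted price per extra square foot of
# living area, holding lot size constant (and likewise for b_lot).
print(f"${b_sqft:,.2f} per sq. ft. of living area, ${b_lot:,.2f} per sq. ft. of lot")
```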
T-STATISTIC IS A MEASURE OF . . .
The t-statistic is a measure of the significance or importance of a regression variable in explaining differences in the dependent variable (sale price).
WHAT IS CONSIDERED A HIGH T VALUE?
Generally, if you have plenty of data and want to have a statistical confidence of 95% in your answer, the critical value that the t-statistic must exceed in absolute value is ±1.96. A t-statistic in excess of ±2.58 indicates that one can be 99% confident that the independent variable is significant in the prediction of sale price.
T STATISTIC RULES OF THUMB
As a rough rule-of-thumb, modelers often use critical t-statistic levels of 1.6 (90% confidence) or 2.0 (95% confidence).
A significance level of .10 suggests that one can be at least 90% confident that the variable coefficient is significantly different from 0 - or, in other words, that there is less than a 10% probability that the coefficient is equal to zero. If the probability is high that the coefficient is equal to zero, this would indicate that the variable provides no useful information to the model.
A significance level of less than .05 would indicate that the probability of the coefficient being equal to zero is 5% or less, which indicates a reliable result. Normally in mass appraisal work, a significance level of less than .10 is desired, and often .05 or less.
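A minimal sketch of the t-statistic calculation for a one-variable model (the sales data are invented; the standard error of the slope uses the standard simple-regression formula):

```python
import numpy as np

# Hypothetical sales data (invented for illustration)
sqft = np.array([900, 1100, 1250, 1400, 1600, 1750, 1900, 2100], dtype=float)
price = np.array([62000, 71000, 74500, 80000, 86000, 90500, 95000, 103000], dtype=float)

n, k = len(price), 1
slope, intercept = np.polyfit(sqft, price, 1)
predicted = intercept + slope * sqft

# Standard error of the estimate, then the standard error of the slope coefficient
see = np.sqrt(np.sum((price - predicted) ** 2) / (n - k - 1))
se_slope = see / np.sqrt(np.sum((sqft - sqft.mean()) ** 2))

# The t-statistic is the coefficient divided by its standard error; compare it
# (in absolute value) to rough critical levels such as 1.6, 2.0, or 2.58.
t_stat = slope / se_slope
print(f"t = {t_stat:.2f}  ->  significant at 95%? {abs(t_stat) > 2.0}")
```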