Flashcards in Hypothesis testing and tests in general Deck (15):
What is the F-statistic? (In words)
A measure of how much better the full model fits the data than the reduced model. It is the ratio of the drop in RSS from the reduced model to the full model, per extra parameter, to the RSS of the full model per residual degree of freedom. The lower the RSS, the better the model fits the data, so a large F favours the full model.
What is the value of the F-statistic? (A formula)
$F = \dfrac{(RSS_r - RSS_f)/(p_f - p_r)}{RSS_f/(n - p_f)}$
where $p_f$ is the length of the vector $\beta$ in the full model,
$p_r$ is the length of the vector $\beta$ in the reduced model,
and $n$ is the number of observations.
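A minimal sketch of the formula above in Python. The function name `f_statistic` and the numbers passed to it are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def f_statistic(rss_r, rss_f, p_r, p_f, n):
    """F statistic comparing a reduced model nested inside a full model."""
    # Drop in RSS per extra parameter in the full model.
    numerator = (rss_r - rss_f) / (p_f - p_r)
    # RSS of the full model per residual degree of freedom.
    denominator = rss_f / (n - p_f)
    return numerator / denominator


# Hypothetical values: the full model has 2 extra parameters and n = 11.
F = f_statistic(rss_r=10.0, rss_f=4.0, p_r=1, p_f=3, n=11)
print(F)  # (6 / 2) / (4 / 8) = 6.0
```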
What distribution does the F statistic follow?
The F distribution, surprise surprise.
Under the reduced model, $F \sim F_{p_f - p_r,\, n - p_f}$
Note that the degrees of freedom are the two divisors used in the F-statistic.
For a test of size 0.05, when do we reject the reduced model in favour of the full?
If $F > F_{p_f - p_r,\, n - p_f}(0.95)$, that is, if $F$ exceeds the 0.95 quantile of the $F_{p_f - p_r,\, n - p_f}$ distribution.
What is the t-test used for?
Testing whether the slope $\beta_1$ equals a hypothesised value $\beta^*$.
What is the statistic used in the t-test?
$T = \dfrac{\hat\beta_1 - \beta^*}{\sqrt{\hat\sigma^2 / S_{xx}}}$
What is the p-value for a two-sided t-test?
$2 \times P(t_{n-2} \ge |T|)$
Why can the t-test be used instead of an F test?
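When testing $\beta_1 = 0$ in simple linear regression, $T \sim t_{n-2}$ and $F \sim F_{p_f - p_r,\, n - p_f} = F_{1,\, n-2}$. Since $T^2 = F$, the two tests give the same p-value.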
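In simple linear regression, the squared t statistic for testing $\beta_1 = 0$ equals the F statistic comparing the intercept-only model with the full model. A small numerical check, assuming made-up data and the usual estimate $\hat\sigma^2 = RSS_f/(n - 2)$:

```python
import math

# Made-up data, used only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates for the full model y = beta0 + beta1 * x.
beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar

# RSS of the full model and of the reduced (intercept-only) model.
rss_f = sum((yi - beta0_hat - beta1_hat * xi) ** 2 for xi, yi in zip(x, y))
rss_r = sum((yi - y_bar) ** 2 for yi in y)

# t statistic for beta1 = 0, with sigma^2 estimated by RSS_f / (n - 2).
sigma2_hat = rss_f / (n - 2)
T = (beta1_hat - 0.0) / math.sqrt(sigma2_hat / s_xx)

# F statistic with p_f = 2 and p_r = 1.
F = ((rss_r - rss_f) / (2 - 1)) / (rss_f / (n - 2))

print(T ** 2, F)  # the two values agree up to rounding error
```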
What is the R-squared statistic used for?
How well the model fits the given data.
The proportion of the total variation in the data explained by the regression fit
What is the R-squared statistic's formula?
$R^2 = \dfrac{\sum_{i=1}^{n} (\hat y_i - \bar y)^2}{\sum_{i=1}^{n} (y_i - \bar y)^2}$
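A short sketch of the R-squared ratio in Python. The helper name `r_squared` and the data are made up for illustration; the fitted values use the least-squares line for this particular data set (intercept 0.05, slope 1.99).

```python
def r_squared(y, y_hat):
    """Explained sum of squares divided by total sum of squares."""
    y_bar = sum(y) / len(y)
    explained = sum((fi - y_bar) ** 2 for fi in y_hat)
    total = sum((yi - y_bar) ** 2 for yi in y)
    return explained / total


# Made-up data and its least-squares fitted values.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
y_hat = [0.05 + 1.99 * xi for xi in x]

r2 = r_squared(y, y_hat)
print(r2)  # close to 1, since the points lie near a straight line
```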
What is the adjusted R-squared statistic?
The R-squared statistic adjusted to take into consideration the number of data points $n$ and the number of predictors $p$
$1 - \dfrac{(1 - R^2)(n - 1)}{n - p - 1}$
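A worked instance of the adjusted R-squared formula, as a sketch. It assumes that p counts the predictors without the intercept, which is the usual convention for the n − p − 1 form; the function name and inputs are hypothetical.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2; p is assumed to exclude the intercept."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Hypothetical values: R^2 = 0.75 from 21 data points and 4 predictors.
adj = adjusted_r_squared(r2=0.75, n=21, p=4)
print(adj)  # 1 - 0.25 * 20 / 16 = 0.6875
```

The adjusted value is lower than the raw 0.75 because the formula penalises each extra predictor.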
What is the Kolmogorov-Smirnov test for?
Testing whether two samples of data have come from the same distribution.
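The two-sample Kolmogorov-Smirnov statistic is the largest vertical gap between the two empirical distribution functions. A minimal sketch that computes the statistic only (not the p-value), with a hypothetical function name and made-up samples:

```python
def ks_two_sample_statistic(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""

    def ecdf(sample, t):
        # Fraction of observations less than or equal to t.
        return sum(1 for v in sample if v <= t) / len(sample)

    # Both empirical CDFs only change at observed data points,
    # so it is enough to compare them at those points.
    return max(abs(ecdf(a, t) - ecdf(b, t)) for t in list(a) + list(b))


# Made-up samples: the second is the first shifted up by 2.
D = ks_two_sample_statistic([1, 2, 3, 4], [3, 4, 5, 6])
print(D)  # 0.5
```

A value near 0 is consistent with a common distribution; a value near 1 means the samples barely overlap.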
Which 3 ways can we test for normality in the residuals?
2. Kolmogorov-Smirnov test
3. Q-Q plot
How does the residual plot tell us whether the residuals are normal or not?
Cov(e, ŷ) = 0, so the residuals and the fitted values should be uncorrelated. If we plot them against each other, we should see random scatter around the horizontal line at 0, with most of the (standardised) residuals falling between −2 and +2. A spread that changes systematically across the plot also suggests non-constant variance.
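The zero covariance between residuals and fitted values can be checked numerically for a least-squares fit with an intercept. A sketch on made-up data:

```python
# Made-up data, used only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares line, fitted values and residuals.
beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
fitted = [beta0_hat + beta1_hat * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Sample covariance between residuals and fitted values.
e_bar = sum(residuals) / n
f_bar = sum(fitted) / n
cov = sum((e - e_bar) * (f - f_bar) for e, f in zip(residuals, fitted)) / (n - 1)

print(abs(cov) < 1e-9)  # True: zero up to rounding error
```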