Flashcards in Hypothesis testing and tests in general Deck (15):

1

## What is the F-statistic (in words)?

### A measure of how much better the full model fits the data than the reduced model. It is the ratio of the drop in RSS when moving from the reduced model to the full model (per extra parameter) to the RSS of the full model (per residual degree of freedom). The lower the RSS, the better the model fits the data.

2

## What is the value of the F-statistic (a formula)?

###
$$F = \frac{(RSS_r - RSS_f)/(p_f - p_r)}{RSS_f/(n - p_f)}$$

where $p_f$ is the length of the vector $\beta$ in the full model,

$p_r$ is the length of the vector $\beta$ in the reduced model, and $n$ is the number of data points.
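As a minimal sketch of the formula above, the following computes F from the two residual sums of squares. The function name `f_statistic` and the numbers in the usage line are illustrative assumptions, not part of the deck.

```python
def f_statistic(rss_r, rss_f, p_r, p_f, n):
    """F = ((RSS_r - RSS_f) / (p_f - p_r)) / (RSS_f / (n - p_f))."""
    if not (p_r < p_f < n):
        raise ValueError("need p_r < p_f < n")
    numerator = (rss_r - rss_f) / (p_f - p_r)   # drop in RSS per extra parameter
    denominator = rss_f / (n - p_f)             # full-model RSS per residual degree of freedom
    return numerator / denominator


# Hypothetical values: reduced model has 1 parameter, full model has 2, n = 12.
F = f_statistic(rss_r=30.0, rss_f=10.0, p_r=1, p_f=2, n=12)
print(F)
```

A large F means the extra parameters of the full model buy a large drop in RSS relative to the remaining noise.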

3

## What distribution does the F statistic follow?

###
The F distribution, surprise surprise. Under the reduced model (the null hypothesis):

$$F \sim F_{p_f - p_r,\ n - p_f}$$

Note that the two degrees of freedom are the two divisors used in the F-statistic.

4

## For a test of size 0.05, when do we reject the reduced model in favour of the full?

### If $F > F_{p_f - p_r,\ n - p_f}(0.95)$, that is, if F exceeds the 0.95 quantile of the $F_{p_f - p_r,\ n - p_f}$ distribution.
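A worked example may help. The numbers are hypothetical, chosen only for illustration: $n = 12$, $p_f = 2$, $p_r = 1$, $RSS_r = 30$, $RSS_f = 10$. The critical value is the tabulated 0.95 quantile of the F distribution with 1 and 10 degrees of freedom.

```latex
F = \frac{(30 - 10)/(2 - 1)}{10/(12 - 2)} = \frac{20}{1} = 20,
\qquad
F_{1,\,10}(0.95) \approx 4.96 .
```

Since $20 > 4.96$, the reduced model would be rejected in favour of the full model at size 0.05.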

5

## What is the t-test used for?

### Testing whether the slope β1 is equal to a hypothesised value β* (that is, H0: β1 = β*).

6

## What is the statistic used in the t-test?

### $T = \dfrac{\hat{\beta}_1 - \beta^*}{\sqrt{\hat{\sigma}^2 / S_{xx}}}$
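A minimal sketch of the statistic for simple linear regression, computing T = (β̂1 − β*) / sqrt(σ̂² / Sxx). It assumes σ̂² = RSS / (n − 2) and Sxx = Σ(xi − x̄)², which the deck does not spell out. The function name and data are made up for illustration.

```python
import math


def slope_t_statistic(x, y, beta_star=0.0):
    """T = (beta1_hat - beta_star) / sqrt(sigma2_hat / Sxx) for y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta1_hat = sxy / sxx                     # least-squares slope
    beta0_hat = y_bar - beta1_hat * x_bar     # least-squares intercept
    rss = sum((yi - beta0_hat - beta1_hat * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)                # assumed estimator of the error variance
    return (beta1_hat - beta_star) / math.sqrt(sigma2_hat / sxx)


# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
T = slope_t_statistic(x, y, beta_star=0.0)
print(T)
```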

7

## What is the p-value for a two-sided t-test?

### $2 \times P(t_{n-2} \ge |T|)$
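As a rough sketch, the tail probability can be approximated with the standard library alone by integrating the Student's t density numerically. In practice a statistics library would normally be used. The function names and the choice of Simpson's rule are assumptions made for this example.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """Approximate 2 * P(t_df >= |t_stat|) using Simpson's rule on [0, |t_stat|]."""
    a = abs(t_stat)
    if a == 0.0:
        return 1.0
    if steps % 2:
        steps += 1                            # Simpson's rule needs an even number of intervals
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central = total * h / 3                   # P(0 <= t_df <= |t_stat|)
    return 2 * (0.5 - central)


# With n = 12 data points the reference distribution is t with n - 2 = 10 degrees of freedom.
p = two_sided_p_value(2.228, df=10)
print(p)
```

Here 2.228 is close to the tabulated 0.975 quantile of the t distribution with 10 degrees of freedom, so the result should be close to 0.05.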

8

## Why can the t-test be used instead of an F test?

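### When testing $\beta_1 = 0$ in simple linear regression, $T \sim t_{n-2}$ and $F \sim F_{1,\ n-2}$ (since $p_f - p_r = 1$ and $n - p_f = n - 2$). Here $T^2 = F$, and the square of a $t_{n-2}$ variable follows an $F_{1,\ n-2}$ distribution, so the two tests give the same p-value.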
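A quick numerical check that T² equals F when testing β1 = 0 in simple linear regression. The sketch assumes the full model is intercept plus slope (2 parameters) and the reduced model is intercept only (1 parameter). The function name and data are made up.

```python
def t_and_f_for_zero_slope(x, y):
    """Return (T, F) for testing beta1 = 0 in y = b0 + b1*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss_f = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))  # full model: intercept + slope
    rss_r = sum((yi - y_bar) ** 2 for yi in y)                     # reduced model: intercept only
    sigma2_hat = rss_f / (n - 2)
    t = b1 / (sigma2_hat / sxx) ** 0.5
    f = ((rss_r - rss_f) / (2 - 1)) / (rss_f / (n - 2))
    return t, f


# Made-up data, for illustration only.
t, f = t_and_f_for_zero_slope([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 5.0, 4.0, 5.0])
print(t * t, f)
```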

9

## What is the R-squared statistic used for?

###
How well the model fits the given data.

The proportion of the total variation in the data explained by the regression fit.

10

## What is the R-squared statistic's formula?

###
$$R^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
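A minimal sketch of this ratio in code: the sum of squared deviations of the fitted values from the mean, divided by the sum of squared deviations of the observations from the mean. The function name and data are illustrative assumptions.

```python
def r_squared(y, y_hat):
    """Regression sum of squares divided by total sum of squares."""
    y_bar = sum(y) / len(y)
    ss_reg = sum((fi - y_bar) ** 2 for fi in y_hat)   # variation explained by the fit
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)       # total variation in the data
    return ss_reg / ss_tot


# Made-up data; y_hat is the least-squares line 2.2 + 0.6*x evaluated at x = 1..5.
y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]
r2 = r_squared(y, y_hat)
print(r2)
```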

11

## What is the adjusted R-squared statistic?

###
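The R-squared statistic adjusted for the number of predictors p relative to the number of data points n, so that adding predictors does not automatically raise it. In this formula p counts the predictors, excluding the intercept.

$$R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - p - 1}$$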
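A small sketch of the adjustment, 1 − (1 − R²)(n − 1)/(n − p − 1), assuming p counts predictors without the intercept. The function name and input values are hypothetical.

```python
def adjusted_r_squared(r2, n, p):
    """1 - (1 - R^2) * (n - 1) / (n - p - 1), with p predictors (intercept not counted)."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Hypothetical fit: R^2 = 0.6 from n = 5 points and p = 1 predictor.
adj = adjusted_r_squared(0.6, n=5, p=1)
print(adj)
```

The adjusted value is never larger than R² when p ≥ 1, which is the penalty for extra predictors.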

12

## What is the Kolmogorov-Smirnov test for?

###
Testing whether two samples of data have come from the same distribution.
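As a sketch, the two-sample test statistic is the largest vertical gap between the two empirical distribution functions. The code below computes only that statistic. Converting it to a p-value needs the Kolmogorov-Smirnov reference distribution and is left out. The function name and samples are made up.

```python
from bisect import bisect_right


def ks_two_sample_statistic(a, b):
    """Largest absolute difference between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for v in a_sorted + b_sorted:             # the gap can only change at observed values
        cdf_a = bisect_right(a_sorted, v) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, v) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Made-up samples, for illustration only.
d = ks_two_sample_statistic([1, 2, 3, 4], [3, 4, 5, 6])
print(d)
```

A statistic near 0 is consistent with a common distribution. A statistic near 1 means the samples barely overlap.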

13

## Which 3 ways can we test for normality in the residuals?

###
1. Histogram

2. Kolmogorov-Smirnov test

3. Q-Q plot
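For the Q-Q plot, a minimal sketch of the points that would be plotted: ordered standardised residuals against standard normal quantiles. The plotting positions (i − 0.5)/n are one common convention and an assumption here, as are the function name and the residual values.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(residuals):
    """Pairs (theoretical normal quantile, ordered standardised residual)."""
    n = len(residuals)
    m, s = mean(residuals), stdev(residuals)
    ordered = sorted((r - m) / s for r in residuals)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n: an assumed convention, others exist.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Made-up residuals, for illustration only.
points = normal_qq_points([-0.8, 0.6, 1.0, -0.6, -0.2])
for q, r in points:
    print(round(q, 3), round(r, 3))
```

If the residuals are roughly normal, the points should lie close to the line y = x.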

14

## How does the Residual plot tell us whether the residuals are normal or not?

### Cov(e, ŷ) = 0, so the residuals and the fitted values should be uncorrelated. If we plot them against each other, we should see random scatter around the horizontal line at 0, with most of the (standardised) residuals falling between ±2. A spread that changes across the fitted values suggests non-constant variance.
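A small check of the zero-covariance claim for a least-squares line with an intercept: the sample covariance between residuals and fitted values comes out as zero up to rounding. The function names and data are made up for illustration.

```python
def residuals_and_fitted(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (residuals, fitted values)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return residuals, fitted


def sample_covariance(u, v):
    """Sample covariance with divisor n - 1."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    return sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v)) / (n - 1)


# Made-up data, for illustration only.
e, y_fit = residuals_and_fitted([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 5.0, 4.0, 5.0])
print(sample_covariance(e, y_fit))
```

Plotting `e` against `y_fit` gives the residual plot described in this card.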

15