Non-Parametric Tests Flashcards

(38 cards)

1
Q

Problems with parametric tests

A

They rely on strong assumptions, e.g. normality of the data, or n large enough to invoke the CLT.

2
Q

What are non-parametric tests?

A

Tests that are valid over a wide range of distributions and can be carried out making far fewer assumptions about the random variable.

3
Q

What is the simplest non-parametric test called?

A

The sign test

4
Q

What type of data does the sign test analyse?

A

Matched pairs

5
Q

Briefly describe the process of setting up the sign test

A

Assign + if 1st value > 2nd
Assign - if 1st value < 2nd
Treat each pair as a Bernoulli trial (success = +)
Under H0, p = 0.5. Repeated Bernoulli trials = binomial
W ~ B(n, 0.5), where W is the number of + signs. Find the probability of a W at least as extreme as the one we observe

6
Q

For sign test how do we calculate the p value if the test is two sided?

A

Work out the tail probability using the binomial, e.g. with n = 9 and at most one sign of the rarer type: 9C0 (0.5)^9 + 9C1 (0.5)^9 = 0.02
P value for the two-sided test = 2 x 0.02 = 0.04
0.04 < 0.05 (alpha), therefore we reject H0
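The binomial calculation on this card can be sketched in Python. This is only an illustration: the function name `sign_test_p_value` and the counts (8 plus signs, 1 minus sign, chosen to match n = 9 on the card) are assumptions, not part of the original notes.

```python
from math import comb

def sign_test_p_value(n_plus, n_minus, two_sided=True):
    """Exact sign-test p-value under H0: p = 0.5.

    Zero differences are assumed to have been discarded already.
    """
    n = n_plus + n_minus
    w = min(n_plus, n_minus)  # count of the rarer sign
    # P(W <= w) for W ~ B(n, 0.5)
    tail = sum(comb(n, k) for k in range(w + 1)) * 0.5 ** n
    # Doubling the tail follows the card; cap at 1 because it is a probability.
    # The one-sided value assumes the alternative points in the observed direction.
    return min(1.0, 2 * tail) if two_sided else tail

p = sign_test_p_value(n_plus=8, n_minus=1)
print(round(p, 4))  # 0.0391, which the card rounds to 0.04
```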

7
Q

For a sign test if n>25, how do we work out the probability?

A

Invoke the CLT, so W/n (the sample proportion of + signs) is approximately normal with mean 0.5 and variance 0.5^2 / n.

Z = (W/n - 0.5) / sqrt(0.5^2 / n)
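A minimal sketch of this normal approximation, using only the standard library. The helper names and the example values (10 plus signs out of n = 30) are hypothetical; the normal CDF is computed from `math.erf`.

```python
from math import erf, sqrt

def sign_test_z(w, n):
    """Z statistic for the sign test: (W/n - 0.5) / sqrt(0.5^2 / n)."""
    return (w / n - 0.5) / sqrt(0.5 ** 2 / n)

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

z = sign_test_z(w=10, n=30)               # illustrative counts only
p_two_sided = 2 * normal_cdf(-abs(z))     # double the tail for a two-sided test
print(round(z, 3), round(p_two_sided, 3))
```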

8
Q

Problem with sign test

A

Ignores magnitude - treats a large negative difference the same as a small negative difference. Collapsing everything to 0 or 1 throws away a lot of information, giving a low-powered test, especially when n is small. We are more likely to make a type 2 error (failing to reject H0 when it is false), so we often find an insignificant test statistic.

9
Q

What do we do with zero differences in the sign test?

A

We discard them and reduce n by 1 for each zero difference discarded.

10
Q

The sign test & sign rank test are only applicable for…

A

Matched pairs

11
Q

How does the sign rank test differ from sign test?

A

It accounts for magnitude of the difference as well as sign

12
Q

Describe sign rank test

A

Rank absolute differences in ascending order of magnitude
If two values have the same magnitude, assign the average rank
Sum up R+ and R- separately

13
Q

What is our test statistic for the sign rank test?

A

T = MIN {R+, R-}
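The ranking steps and the statistic T can be sketched as follows. The function names and the sample data are made up for illustration, and dropping zero differences here is an assumption carried over from the sign test cards.

```python
def average_ranks(values):
    """Ascending ranks, with tied values sharing their average rank."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]

def signed_rank_statistic(first, second):
    """Return (R+, R-, T) for matched pairs, where T = min(R+, R-)."""
    # Assumption: zero differences are dropped, as in the sign test
    diffs = [a - b for a, b in zip(first, second) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return r_plus, r_minus, min(r_plus, r_minus)

# Differences are 2, -1, 0 (dropped), 4, 2, so the two 2s share rank 2.5
result = signed_rank_statistic([10, 12, 9, 15, 11], [8, 13, 9, 11, 9])
print(result)  # (9.0, 1.0, 1.0)
```

A quick check on the arithmetic: R+ and R- should always add up to n(n+1)/2, here 4 x 5 / 2 = 10.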

14
Q

Under H0 for the sign rank test, what is E(T) and V(T)

A
E(T) = n(n+1)/4
V(T) = n(n+1)(2n+1)/24
15
Q

When n<25, how do we work out our critical values (CVs) for the sign rank test?

A

Use the tables given in the formula sheet. The correct value of alpha depends on whether the test is one- or two-sided.

16
Q

When do we reject H0 for the sign rank test? Why?

A

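If our test statistic < CV

Because T is the minimum of R+ and R-, a small T means the ranks are concentrated on one sign, which is evidence against H0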

17
Q

If n>25, what do we do for the sign rank test?

A

Invoke CLT = approx normality

Our test statistic is given by Z = [T - n(n+1)/4] / sqrt[n(n+1)(2n+1)/24]
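A short sketch of this standardisation, written as (T - E(T)) / sqrt(V(T)) with E(T) = n(n+1)/4 and V(T) = n(n+1)(2n+1)/24 from the earlier card. The function name and the example values T = 120, n = 30 are hypothetical.

```python
from math import sqrt

def signed_rank_z(t, n):
    """Standardise T using its mean and variance under H0."""
    mean_t = n * (n + 1) / 4
    var_t = n * (n + 1) * (2 * n + 1) / 24
    return (t - mean_t) / sqrt(var_t)

z = signed_rank_z(t=120, n=30)  # illustrative values only
print(round(z, 3))
```

Since T is the smaller of the two rank sums, z computed this way is never positive; a large negative value is what counts against H0.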

18
Q

Limitation of sign rank test

A

Uses only the ranks of the differences, not their actual sizes - if the highest absolute difference is 2, it is given rank n; if the highest is 100, it is still given rank n. This may compress or stretch the data. Less powerful than a parametric test when the parametric assumptions hold, but more powerful than the sign test.

19
Q

When n>25 for sign rank test, when do we reject?

A

Reject if p value is less than the significance level (same as usual hypothesis testing)

20
Q

When is the Mann-Whitney test applicable?

A

We can use it even if we don’t have matched pairs. Use it for two independent random samples to test for a difference in central location (e.g. means) between the two groups.

21
Q

How do we rank equal magnitudes in the sign rank test?

A

Average rank, e.g. if two numbers are to be ranked 4 & 5, give them both rank (4+5)/2 = 4.5

22
Q

Describe the Mann Whitney test

A

Rank all n1 + n2 observations together, but keep track of which sample each one came from (preserve the colour)
Equal values are given an average rank
Sum the ranks of sample 1 to get R1
Work out U (see formula sheet), E(U) & V(U) and then the test statistic
If n>25, approx normal.
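The steps above can be sketched as below. The card defers to the formula sheet for U, so the form used here is an assumption: the common textbook version U = n1 n2 + n1(n1+1)/2 - R1, with E(U) = n1 n2 / 2 and V(U) = n1 n2 (n1 + n2 + 1) / 12. The function names and data are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Ascending ranks, with tied values sharing their average rank."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]

def mann_whitney_z(sample1, sample2):
    """Return (R1, U, Z) using the assumed textbook formulas for U, E(U), V(U)."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))  # rank the pooled data
    r1 = sum(ranks[:n1])                                  # rank sum of sample 1
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 * (n1 + n2 + 1) / 12
    return r1, u, (u - mean_u) / sqrt(var_u)

# Tiny samples keep the arithmetic checkable by hand; the normal
# approximation itself is only meant for larger samples.
r1, u, z = mann_whitney_z([3, 5, 8], [1, 2, 4, 6])
print(r1, u, round(z, 3))
```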

23
Q

When do we reject for Mann Whitney test?

A

If n>25, approx normal = Z test
Double the p value for a two-sided test
Reject if p value is less than the significance level

24
Q

When do we use goodness of fit test?

A

Where we have discrete outcomes into k categories (can also use for continuous data but need to put into discrete categories first)

25
Q

Describe goodness of fit test

A

Calculate Ei = n pi for each of the K categories
See formula sheet to calculate the test statistic using the simpler version
The test statistic follows a chi squared distribution
Reject if test stat > CV given by chi squared
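A minimal sketch of the calculation, assuming the usual statistic, sum of (Oi - Ei)^2 / Ei. The shortcut sum of Oi^2 / Ei minus n is algebraically the same and may be what the formula sheet calls the simpler version, but that is a guess. The die-roll counts are invented, and the critical value is the standard 5% table value for 5 degrees of freedom.

```python
def goodness_of_fit(observed, probs):
    """Return (expected counts, chi-squared statistic, shortcut form)."""
    n = sum(observed)
    expected = [n * p for p in probs]  # Ei = n * pi
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    shortcut = sum(o * o / e for o, e in zip(observed, expected)) - n
    return expected, stat, shortcut

observed = [5, 8, 9, 8, 10, 20]        # hypothetical: a die rolled 60 times
expected, stat, shortcut = goodness_of_fit(observed, [1 / 6] * 6)
critical_value = 11.070                # chi-squared table, 5% level, K - 1 = 5 dof
reject_h0 = stat > critical_value
print(round(stat, 2), reject_h0)
```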
26
Q

When do we reject H0 for goodness of fit test?

A

If the test statistic is greater than the CV given by the chi squared distribution
27
Q

What distribution do we get the CV from for a goodness of fit test?

A

Chi squared
28
Q

DOF for CV for goodness of fit test =

A

DOF = K - 1 where K is the number of different categories
29
Q

What is a condition for the goodness of fit test to be appropriate? How can we solve it?

A

Ei should not be less than 5 for any category; if it is, aggregate it with another category.
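One way to do the aggregation is sketched below. Merging a small category into its neighbour is an assumption that suits ordered categories; which categories to combine is otherwise a judgment call. The function name and the counts are hypothetical.

```python
def merge_small_categories(observed, expected, minimum=5):
    """Merge any category with expected count below `minimum` into a neighbour."""
    obs, exp = list(observed), list(expected)
    i = 0
    while len(exp) > 1 and i < len(exp):
        if exp[i] < minimum:
            # Absorb into the next category, or the previous one at the end
            j = i + 1 if i + 1 < len(exp) else i - 1
            obs[j] += obs[i]
            exp[j] += exp[i]
            del obs[i], exp[i]
            i = 0  # restart, since the merged totals have changed
        else:
            i += 1
    return obs, exp

merged = merge_small_categories([12, 9, 6, 2, 1], [10, 10, 6, 3, 1])
print(merged)  # ([12, 9, 9], [10, 10, 10])
```

After merging, K is the number of categories that remain, so the degrees of freedom K - 1 fall accordingly.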
30
Q

What is H0 usually for goodness of fit test?

A

H0 = all outcomes equally likely
So pi = 1/K
Ei = n/K for each category
31
Q

What data are contingency tables used for?

A

Where we have a two way table with K categories in A and H in B, so we have KH cross classifications.
32
Q

Why don't we use hypothesis testing or ANOVA instead of contingency tables?

A

Standard two-sample hypothesis tests are limited to two groups
ANOVA allows more than two groups but requires the assumption of normality
33
Q

What are contingency tables another form of?

A

Goodness of fit test but with a two way table rather than one
34
Q

What is H0 for contingency table?

A

H0: the variables are not related
H1: the variables are related
35
Q

How do we work out the expected values for contingency tables?

A

Eij = n pij
Since under H0 the variables are independent, pij = p(i) x p(j)
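As a sketch, estimating p(i) and p(j) from the row and column totals gives Eij = (row total x column total) / n. The function name and the 2 x 3 table are invented for illustration.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Eij = n * (row_i / n) * (col_j / n) = row_i * col_j / n
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_counts([[10, 20, 30], [20, 20, 20]])
print(expected)  # [[15.0, 20.0, 25.0], [15.0, 20.0, 25.0]]
```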
36
Q

What distribution do we get CVs from for contingency tables?

A

Chi squared
37
Q

How do we work out degrees of freedom for contingency tables?

A

DOF = (r - 1)(c - 1)
r = number of rows
c = number of columns
38
Q

When do we reject H0 for contingency tables?

A

If the test statistic is greater than the CV given by the chi squared distribution
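Putting the contingency table cards together, a rough sketch of the whole test might look like this. It assumes the usual statistic, sum of (Oij - Eij)^2 / Eij; the function name and the table are hypothetical, and the critical value is the standard 5% table value for 2 degrees of freedom.

```python
def contingency_chi_squared(table):
    """Return (chi-squared statistic, degrees of freedom) for a two way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # Eij under H0
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)    # (r - 1)(c - 1)
    return stat, dof

stat, dof = contingency_chi_squared([[10, 20, 30], [20, 20, 20]])
critical_value = 5.991  # chi-squared table, 5% level, 2 degrees of freedom
reject_h0 = stat > critical_value
print(round(stat, 3), dof, reject_h0)
```

In this made-up table the statistic falls below the critical value, so H0 (variables not related) is not rejected at the 5% level.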