Topic 11 - Tests for Relationship Flashcards

1
Q

L.O.

A

LO7 Given real multivariate data and a problem, formulate an appropriate hypothesis and perform a range of hypothesis tests.
LO8 Interpret the p-value, conscious of the various pitfalls associated with testing.

2
Q

Chi-squared tests (χ^2)

A

Used for:

Goodness of Fit:
- Tests whether the observed frequency distribution of a categorical variable matches an expected theoretical distribution
e.g. Do the eye colours of DATA1001 students follow 45% brown, 27% blue, 28% green?

Independence:
- Examines whether there is a significant association between 2 qualitative variables
e.g. Is there an association between a person's eye colour and their parents' eye colour?

3
Q

χ^2 GoF vs Independence

A

GoF:
- One variable
- Compare observed data to a theoretical distribution

Independence:
- Two variables
- Examine the relationship between variables within the same population

4
Q

Chi-squared test statistic

A

Contribution of each category = (OF - EF)^2 ÷ EF

where OF = observed frequency and EF = expected frequency

χ^2 = sum of the contributions over all categories
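The formula above can be sketched in a few lines of Python. This is only an illustration: the function name and the counts are made up for the example and are not course data.

```python
def chi_squared_contributions(observed, expected):
    """Return the (OF - EF)^2 / EF contribution of each category."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


# Hypothetical counts for three categories (illustrative only).
observed = [50, 30, 20]
expected = [45, 27, 28]

contributions = chi_squared_contributions(observed, expected)
chi_squared = sum(contributions)  # the chi-squared test statistic

print(contributions)
print(chi_squared)
```

A category with a large contribution is one where the observed count sits far from the expected count relative to its size.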

5
Q

χ^2 GoF HATPC process

A

H:
H0 = any differences between OF and EF are due to chance alone
H1 = the differences are NOT due to chance alone

A:
- Observations are independent
- Expected frequencies (EF): no cell is empty and no more than 20% of cells have EF < 5

T:
Contribution of each category = (O - E)^2 ÷ E

χ^2 = Σ [(O - E)^2 ÷ E]

P:
Uses a χ^2 distribution:
DoF = k - 1
k = # of categories

e.g. DoF = 6 - 1 = 5
[see notebook]

C:
If p > 0.05, the data are consistent with H0 and H0 is retained
If p < 0.05, H0 is rejected
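The HATPC steps for a goodness-of-fit test can be sketched as below. Everything here is illustrative: the die rolls are hypothetical, the function names are invented for this example, and the p-value comes from a simple series for the incomplete gamma function that should be adequate for moderate χ^2 values but is not a substitute for a statistics library.

```python
import math


def chi2_upper_tail(x, dof):
    """P(X >= x) for a chi-squared variable, using the series for the
    regularised lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, z = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def goodness_of_fit(observed, proportions):
    n = sum(observed)
    expected = [n * p for p in proportions]
    # A: no expected count is empty and at most 20% are below 5.
    small = sum(1 for e in expected if e < 5)
    assumptions_ok = all(e > 0 for e in expected) and small <= 0.2 * len(expected)
    # T: sum of (O - E)^2 / E over the categories.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # P: chi-squared distribution with k - 1 degrees of freedom.
    dof = len(observed) - 1
    return {
        "expected": expected,
        "assumptions_ok": assumptions_ok,
        "statistic": statistic,
        "dof": dof,
        "p_value": chi2_upper_tail(statistic, dof),
    }


# Hypothetical data: 60 rolls of a die, H0 = the die is fair.
rolls = [8, 12, 9, 11, 6, 14]
result = goodness_of_fit(rolls, [1 / 6] * 6)
print(result)
```

With six categories the sketch uses DoF = 6 - 1 = 5, matching the example on the card; the conclusion then follows from comparing `result["p_value"]` with 0.05.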

6
Q

χ^2 Independence HATPC Example

A

H:
H0 = severity of withdrawal symptoms IS independent of belief about caffeine consumption
H1 = severity is NOT independent of belief about consumption; there is an association between belief and severity

A:
- Cochran's Rule (no more than 20% of EF < 5 and no EF empty)
- Observations are independent

T:
χ^2 = Σ [(O - E)^2 ÷ E], summed over every cell of the table
DoF = (m-1)(n-1)
where,
m = # categories in variable 1
n = # categories in variable 2

P:
The chance of observing the sample χ^2 value, or one more extreme, on a χ^2 distribution with 2 DoF
[see notebook]

C:
P > 0.05 = Retain H0, the data are consistent with NO association
P < 0.05 = Reject H0, there is evidence of an association
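A minimal sketch of the independence test on a 2 x 3 table follows. The counts are hypothetical (they are not the caffeine study's data), and the function name is invented for the example. The expected count for each cell is taken as row total × column total ÷ grand total, and the p-value uses the closed form exp(-χ^2 ÷ 2), which holds only when DoF = 2.

```python
import math


def independence_test(table):
    """Return expected counts, the chi-squared statistic and the DoF."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count under H0 (independence) for each cell.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, dof


# Hypothetical counts: rows = belief (2 groups), columns = severity (3 levels).
table = [
    [20, 30, 50],
    [40, 30, 30],
]

expected, statistic, dof = independence_test(table)

# Closed form for the upper tail of a chi-squared distribution with 2 DoF only.
p_value = math.exp(-statistic / 2)

print(expected)
print(statistic, dof, p_value)
```

For tables with a different DoF the closed form no longer applies and a general χ^2 tail probability is needed.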

7
Q

Mosaic plot with standardised residuals

A
- The residuals indicate the ‘gaps’ (O-E) for each individual combination of categories

[see notebook]
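As a rough illustration of how the gaps can be standardised, the sketch below computes (O - E) ÷ √E for every cell. That Pearson form is an assumption; the card does not say which standardisation the plot uses. The table is hypothetical.

```python
import math


def standardised_residuals(table):
    """Return (O - E) / sqrt(E) for every cell of a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    residuals = []
    for row, r_total in zip(table, row_totals):
        res_row = []
        for o, c_total in zip(row, col_totals):
            e = r_total * c_total / grand_total  # expected count under independence
            res_row.append((o - e) / math.sqrt(e))
        residuals.append(res_row)
    return residuals


# Hypothetical 2 x 3 table of observed counts.
table = [
    [20, 30, 50],
    [40, 30, 30],
]

residuals = standardised_residuals(table)
for row in residuals:
    print([round(r, 2) for r in row])
```

Positive values mark cells with more observations than expected and negative values mark cells with fewer; under this form the squared residuals add up to the χ^2 statistic.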

8
Q

One-sample t-test for the slope (Regression Test)

A

[see notebook for the equation]

H:
H0 = There is NO linear trend (slope = 0)
H1 = There is a linear trend (slope ≠ 0)

A:
- Independence of residuals (check context)
- Normality of residuals (QQplot, Shapiro-Wilk)
- Homoscedasticity of residuals (residual plot)
- Linearity of variables (scatterplot, residual plot)

T:
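T = (OV - EV) ÷ SE = (estimated slope - 0) ÷ SE of the slope
where OV = observed value and EV = expected value under H0
In R: summary() of the fitted model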

C:
P > 0.05 = Retain H0, the slope is not significantly different from 0
P < 0.05 = Reject H0, the slope is significantly different from 0
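The test statistic for the slope can be worked through by hand. The sketch below is an illustration in Python rather than R, with hypothetical data and an invented function name; it uses the usual least-squares slope and its standard error, with n - 2 degrees of freedom, and stops at the t statistic without computing a p-value.

```python
import math


def slope_t_statistic(x, y):
    """Return the fitted slope, its standard error, T and the DoF."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    t_stat = (slope - 0) / se_slope  # (OV - EV) / SE with EV = 0 under H0
    return slope, se_slope, t_stat, n - 2


# Hypothetical data (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, se_slope, t_stat, dof = slope_t_statistic(x, y)
print(slope, se_slope, t_stat, dof)
```

The t statistic is then compared with a t distribution on n - 2 degrees of freedom to obtain the p-value used in the conclusion.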
