Categorical analysis 2 Flashcards

1
Q

What should you use if you are comparing two nominal variables?

A

Chi-square test of association (also called the chi-square test of independence)

It tests whether two nominal-scale variables are related to each other
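
As a rough illustration, the test can be run in base R with chisq.test(). The counts and the labels (species, choice) below are made up for the example and are not from the course data.

```r
# Hypothetical counts: rows are one nominal variable, columns the other
tab <- matrix(c(30, 10,
                20, 40),
              nrow = 2, byrow = TRUE,
              dimnames = list(species = c("robot", "human"),
                              choice  = c("puppy", "flower")))

# Pearson chi-square test of association, without the continuity correction
result <- chisq.test(tab, correct = FALSE)

result$statistic  # the X-squared value
result$parameter  # degrees of freedom: (rows - 1) * (columns - 1)
result$p.value    # a small p-value suggests the two variables are related
```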

2
Q

What is ‘effect size’?

A

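A measure of how big an effect is. It is needed because the outcome of a hypothesis test also depends on the sample size (the larger the sample, the easier it is to reach significance)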

3
Q

Why is it recommended that an independent measure of effect size be used when reporting a significant statistical effect?

A

A small treatment effect can be statistically significant if the sample is large enough, so significance alone does not show that the effect is large
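
A small sketch of this point, using made-up counts: the two tables below have exactly the same proportions, and only the sample size differs.

```r
# Hypothetical counts with a 55% / 45% split in each row
small <- matrix(c(55, 45,
                  45, 55), nrow = 2, byrow = TRUE)  # N = 200
large <- small * 10                                 # same proportions, N = 2000

p_small <- chisq.test(small, correct = FALSE)$p.value
p_large <- chisq.test(large, correct = FALSE)$p.value

p_small  # above 0.05: not significant at the usual level
p_large  # far below 0.05: significant, although the proportions are identical
```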

4
Q

What does an effect size estimate provide information about?

A

The size of an effect, expressed in a way that is not influenced by factors such as sample size

It measures how ‘big’ the difference between the data and the null hypothesis predictions actually was

5
Q

What does ‘Cramer’s V’ do?

A

Measures effect size in categorical analysis (chi-square)

6
Q

What is the R command for ‘Cramer’s V’?

A

associationTest() prints it automatically, but you can also call cramersV() to compute it directly (both functions are in the lsr package)
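
For reference, Cramer’s V can also be computed by hand as sqrt(X² / (N(k − 1))), where N is the total count and k is the smaller of the number of rows and columns. The sketch below uses made-up counts; the names tab, v_manual and v_lsr are only for illustration, and the lsr call is skipped if the package is not installed.

```r
# Hypothetical 2 x 3 table of counts
tab <- matrix(c(20, 30, 50,
                40, 30, 30), nrow = 2, byrow = TRUE)

chi2 <- unname(chisq.test(tab)$statistic)  # Pearson X-squared
n    <- sum(tab)                           # total number of observations
k    <- min(dim(tab))                      # smaller of rows and columns

v_manual <- sqrt(chi2 / (n * (k - 1)))
v_manual

# If the lsr package is available, cramersV() should agree with the manual value
if (requireNamespace("lsr", quietly = TRUE)) {
  v_lsr <- lsr::cramersV(tab)
  print(v_lsr)
}
```

For a 2 x 2 table, chisq.test() applies a continuity correction by default, so a hand calculation only matches if the same setting is used in both places.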

7
Q

How should you roughly interpret Cramer’s V?

A
8
Q

Why are assumptions necessary in a test?

A

They are necessary to allow inference

If the assumptions are wrong, though, you can reach mistaken conclusions

9
Q

Is the sampling distribution of the test statistic exactly ‘chi-square’ in chi-square tests?

A

No, only approximately

10
Q

What assumptions do both chi-square tests (‘goodness of fit’ and ‘association’) make?

A

‘Large’ expected frequencies

Independence of the data

11
Q

Why are ‘large’ expected frequencies an assumption of chi-square tests?

A

The test statistic only follows a chi-square distribution (approximately) if we can presume that there are enough observations for the underlying binomial distributions to be approximately ‘normal’
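
One way to check this assumption is to look at the expected frequencies themselves, which chisq.test() returns. The counts below are made up, and the cut-off of 5 is a commonly quoted rule of thumb rather than something stated on these cards.

```r
# Hypothetical small table of counts
tab <- matrix(c(3, 1,
                2, 6), nrow = 2, byrow = TRUE)

# Expected count per cell under independence: row total * column total / N
expected_manual <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# chisq.test() reports the same values (it warns here because they are small)
expected <- suppressWarnings(chisq.test(tab)$expected)

expected
any(expected < 5)  # TRUE suggests the chi-square approximation may be poor
```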

12
Q

What test should you use for comparing nominal variables if frequencies are too small?

A

Fisher Exact Test

13
Q

What is the Fisher Exact Test?

A

An analogue of the chi-square test of association

However, it doesn’t require large expected frequencies (works best for small frequencies)

14
Q

What assumptions does the Fisher Exact Test make that the chi-square test of association doesn’t?

A

It assumes that both the row totals and the column totals are fixed

(they are treated as known in advance and would be the same numbers in any repeat of the study)

15
Q

How does the Fisher Exact Test work?

A

By calculating the exact probability of obtaining a particular contingency table (i.e. cross-tabulation)

  • The p-value is calculated by summing over all contingency tables that are “more extreme” than the observed one.
  • The definition of “more extreme” is tricky, but basically means “more uneven”
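
The idea can be sketched in R on a made-up 2 x 2 table. With the row and column totals fixed, the top-left cell follows a hypergeometric distribution, and the two-sided p-value used by fisher.test() adds up the probabilities of every possible table that is no more likely than the observed one. The variable names are only for illustration.

```r
# Hypothetical small counts
tab <- matrix(c(3, 1,
                2, 6), nrow = 2, byrow = TRUE)

row1 <- sum(tab[1, ])   # first row total
row2 <- sum(tab[2, ])   # second row total
col1 <- sum(tab[, 1])   # first column total

# Every value the top-left cell can take while the totals stay fixed
possible <- max(0, col1 - row2):min(row1, col1)

# Exact probability of each possible table, and of the observed one
probs <- dhyper(possible, m = row1, n = row2, k = col1)
p_obs <- dhyper(tab[1, 1], m = row1, n = row2, k = col1)

# Sum over tables at least as "extreme" (no more probable) as the observed one;
# the small tolerance guards against floating-point ties
p_manual <- sum(probs[probs <= p_obs * (1 + 1e-7)])

p_manual
fisher.test(tab)$p.value  # the built-in test gives the same two-sided p-value
```
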
16
Q

What is the main thing to note in the Fisher Exact Test when looking at the results?

A

p-value

17
Q

What does the second assumption of chi-square tests ‘independence of data’ mean?

A

There can’t be any ‘special relationship’ among your observations

(e.g. the same people contributing more than one observation, such as taking part in the same experiment twice)

18
Q

What test should you use to analyse categorical data if the two sets of data are not independent of one another?

A

McNemar test

19
Q

What is McNemar’s ‘limited solution to a standard problem’?

Describe the problem and his solution.

A

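Problem:

What do you do when you have multiple observations from each person? (e.g. pre-test and post-test)

You can’t use the chi-square test because this violates the independence assumption

Solution:

It is limited to the case where you have a binary outcome measure (e.g. yes or no) and you measure it twice on the same people (e.g. pre- and post-); you then cross-tabulate the before and after answers and use the McNemar test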

20
Q

Which two ‘cells’ does McNemar’s test compare in the cross-tabulation of before answers (yes, no) against after answers (yes, no)?

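A

The two off-diagonal cells, i.e. the people whose answer changed: yes before and no after, and no before and yes after
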
21
Q

How do you do McNemar’s test in R?

A

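# ads is assumed to be a data frame with one row per person and two yes/no columns, before and after
allAds <- xtabs(~ after + before, data = ads)  # cross-tabulate the paired answers

mcnemar.test(x = allAds)  # McNemar's chi-square test on the 2 x 2 table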
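
A self-contained sketch with made-up counts shows what the test uses: only the two cells where the answer changed. The table below stands in for allAds, and the variable names are only for illustration. By default mcnemar.test() applies a continuity correction, which the hand calculation reproduces.

```r
# Hypothetical paired answers from 100 people (rows: after, columns: before)
tab <- matrix(c( 5,  5,
                25, 65), nrow = 2, byrow = TRUE,
              dimnames = list(after  = c("yes", "no"),
                              before = c("yes", "no")))

yes_to_no <- tab["no", "yes"]   # said yes before, no after
no_to_yes <- tab["yes", "no"]   # said no before, yes after

# McNemar statistic with continuity correction, compared to chi-square with 1 df
stat_manual <- (abs(yes_to_no - no_to_yes) - 1)^2 / (yes_to_no + no_to_yes)
p_manual    <- pchisq(stat_manual, df = 1, lower.tail = FALSE)

result <- mcnemar.test(tab)
result$statistic  # matches stat_manual
result$p.value    # matches p_manual
```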